| dc.description.abstract | With the rapid growth of web technologies, individuals and organizations 
increasingly use blogs, forums, review sites, social networks, etc. 
to express their views and opinions. These reviews are very useful to service 
providers, manufacturers and organizations in making informed decisions and 
improving their services. However, the volume of reviews on social media 
is growing so rapidly that it is becoming increasingly difficult for users to analyze them and extract 
relevant information. Therefore, automated sentiment analysis is needed. 
In this research, we present a multiscale sentence-level sentiment analysis model for 
Tigrigna online posts using a supervised machine learning approach. The multiscale 
Tigrigna sentiment analysis model classifies a given sentence into five predefined 
classes: very positive (2), positive (1), neutral (0), negative (-1) and very negative (-2). 
We use three supervised machine learning algorithms: Naïve Bayes (NB), 
Maximum Entropy (MaxEnt) and Support Vector Machine (SVM), with unigram, 
bigram, trigram and hybrid unigram-bigram variants of N-grams as features. The 
proposed model comprises several components: preprocessing (tokenization, 
normalization, stop word removal), morphological analysis (lemmatization), feature 
extraction, training of the machine learning algorithms, classification, and evaluation of the 
results using evaluation metrics. 
For the experiments, 1500 Tigrigna sentences were collected from different 
sources. Due to the morphological complexity of the language, preprocessing 
techniques were applied to clean noisy data and reduce the sparseness and 
dimensionality of the dataset. After preprocessing, the dataset is lemmatized before it 
is passed to the training phase of the experiment. The experimental results show that the SVM 
algorithm with the unigram model outperforms the other algorithms, with 71% accuracy. 
In conclusion, despite the language's morphological complexity and the lack of effective 
morphological analysis tools, the experimental results are promising.
However, we are convinced that the results could improve further with a larger, pre-annotated and cleaned corpus. | en_US |