Website Structure Analysis Through Sentiment Data Evaluation

Sentiment-Coloured Network Graph of the Mobiliar Website. Own illustration using the PyVis framework.

Finding patterns in the structure of the Mobiliar website that correlate with the sentiment analysis of its pages.


Company websites are among the most important sources of information for customers, and many of their pages stay online for years. It is therefore important to maintain a consistent standard when creating website content. Companies with extensive websites, such as Schweizerische Mobiliar Versicherungsgesellschaft AG (hereafter referred to as Mobiliar), employ content managers who are responsible for the website's content. In their work, neither the creation nor the review of the texts involves computer-aided analysis methods or artificial intelligence.

The aim of this study was to use data analysis to determine whether patterns exist in the structure of the Mobiliar website that are related to the sentiment values of its pages. Such patterns could reveal weak points, for example topic areas that consistently receive more negative sentiment values.

To measure sentiment, a sentiment analysis was carried out on the page texts. Both the paper and the project are divided into two parts: the first covers obtaining the sentiment analysis data, and the second covers evaluating that data to answer the research question. All steps taken in developing the project are documented in the paper.

The pages of the Mobiliar website were divided into clusters using one specially developed clustering method and two standardised ones. Analysing the correlation between the sentiment value of each page and the average sentiment value of the pages in the same cluster revealed significant correlations.
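The correlation described above can be illustrated with a minimal sketch. It is not the study's actual implementation: the function names, the sample scores and the cluster labels are hypothetical, and it assumes a Pearson correlation in which each page is compared with the mean of the other pages in its cluster (leaving the page itself out, so that it does not correlate with its own value).

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return cov / sqrt(var_x * var_y)


def cluster_mean_correlation(sentiment, cluster_of):
    """Correlate each page's sentiment with the mean of its cluster peers."""
    members = defaultdict(list)
    for page, cluster in cluster_of.items():
        members[cluster].append(page)

    own, peers = [], []
    for page, cluster in cluster_of.items():
        others = [sentiment[p] for p in members[cluster] if p != page]
        if not others:  # singleton cluster: no peers to compare with
            continue
        own.append(sentiment[page])
        peers.append(mean(others))
    return pearson(own, peers)


# Hypothetical data: page -> sentiment score, page -> cluster label.
sentiment = {"a": 0.8, "b": 0.6, "c": -0.4, "d": -0.6, "e": 0.1}
cluster_of = {"a": "x", "b": "x", "c": "y", "d": "y", "e": "z"}
r = cluster_mean_correlation(sentiment, cluster_of)
```

A value of `r` close to 1 would indicate that pages tend to share the sentiment of their cluster; whether such a value is significant depends on the number of pages and would need a separate test.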

Clustering the pages thus revealed a pattern in the structure of the Mobiliar website network: the sentiment value of a page correlates with its location in the network. This answers the research question of whether patterns exist in the structure of the Mobiliar website in which the sentiment a page carries correlates with the subject area it deals with.

The clustering method developed here groups pages by the elements of their URL path. It provides a way to identify textual weaknesses on the Mobiliar website, based on the sentiment analysis results for the page texts.
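Grouping pages by URL path elements might look roughly like the following sketch. The details of the study's method are not given here, so this assumes the simplest variant: the first `depth` path segments serve as the cluster label. The names `cluster_key` and `cluster_by_path`, the `depth` parameter and the example URLs are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cluster_key(url, depth=1):
    """Return the first `depth` path segments of `url` as a cluster label."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Pages without any path segment go into a shared "(root)" cluster.
    return "/".join(segments[:depth]) or "(root)"


def cluster_by_path(urls, depth=1):
    """Group URLs into clusters that share the same leading path segments."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[cluster_key(url, depth)].append(url)
    return dict(clusters)


# Hypothetical URLs, not taken from the Mobiliar website.
urls = [
    "https://example.com/",
    "https://example.com/insurance/car",
    "https://example.com/insurance/home",
    "https://example.com/about/careers",
]
clusters = cluster_by_path(urls)
```

With `depth=1` the two insurance pages fall into one cluster; a larger `depth` yields finer clusters that follow the site's topic hierarchy more closely.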