Sustainable Tourism Index

Joëlle Duff


Abstract

In this notebook I create a composite index to measure sustainability in the tourism industry. The Sustainable Tourism Index was composed using Factor Analysis, where various observable variables were combined to form an unobservable factor, "Sustainable Tourism". I explain the importance and relevance of such an index, before discussing the method used to create it. I find that factor analysis is extremely suitable for creating such a composite index, since differential weighting is representative and realistic in attributing weight to the factor. Alternatively, instead of using several individual indices, it may be more valuable to calculate one index for all years, such that the scores can be compared over multiple years more reliably.

image

Research Question

Can we create a reliable and representative composite index which measures sustainability in the tourism industry?

I attempt to create such an index with data that is freely available to the public

Motivation - What is Sustainable Tourism and why do we need it?

The impact of Tourism

In the last decades tourism has become increasingly accessible to people around the world; with travel costs decreasing, and disposable income steadily rising, it has become more affordable for the population to travel and discover new places. In fact, in the 1950s 25 million international arrivals were recorded, whereas by 2015 this amount had risen to a staggering 1.2 billion arrivals (Sharpley and Telfer, 2014 and WTTC, 2016). With the tourism industry now contributing almost 10% to global GDP annually, and employing over 300 million people worldwide, it is not surprising that it continues to have a tremendous impact on economic development. Besides GDP and employment, the industry also directly affects investments, export, consumption, and even wealth distribution. Distinct from many other sectors, the tourism industry provides an immensely diverse range of activities and products, thus generating an extensive value chain which impacts multiple other sectors.

The need for Sustainability

Unfortunately the development of tourism is paired with various production externalities which can damage long-term growth. Over-exploitation of and competition for natural resources directly affect the livelihood of local communities and stress regional biodiversity. Creaco and Querini (2003) note that the negative and irreversible effect of unplanned and uncontrolled growth actually destroys the unique natural and social resource foundation of tourism. Consequently, it is important to balance the use of social, environmental, and economic factors of tourism. With this in mind, the tourism industry is increasingly becoming more sustainable, and global institutions are focussing on stimulating this sustainable development through specific programmes such as the Sustainable Tourism – Eliminating Poverty (ST-EP) Initiative and the 10-Year Framework of Programmes on Sustainable Consumption and Production Patterns Sustainable Tourism Programme (UNWTOb, 2016 and UNEP, 2016).

In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo("JFbbKbdqoJg")
Out[1]:

Method - How can we measure Sustainable Tourism?

Measuring Sustainable Tourism

With the UN introducing 2017 as the International Year of Sustainable Tourism (Video Link for IY 2017), focus is increasingly on the measurement of sustainability in the industry, and on tracking its development. Since sustainability is a complex and multivariate concept, different methods have been proposed to quantify a composite index for the level of sustainable tourism. Such a composite would properly encase the multitude of elements inherent to sustainability and has been attempted in the following cases; Tourism Penetration Index, the Barometer of Tourism Sustainability, the Sustainable Tourism Index, the Sustainable Tourism Benchmarking Tool, and the Vectorial Dynamic Composite Indicator (McElroy & De Albuquerque, 1998; Ko, 2005; Pulido Fernández & Sánchez Rivero, 2009; Cernat & Gourdon, 2012; & Blancas et al., 2016). Unfortunately however, many of the variables needed to create these composites are not (well) measured by data gathering institutions. With this in mind, I attempt to create a new index which is based on variables that are currently measured and available.

Data Sources

After reviewing various data sources I decided that data from the World Economic Forum (WEF) and Yale were most relevant for this analysis. These organisations have collected data on the sustainable development of the tourism industry and on environmental performance, respectively, and are reliable and valid sources of data. The WEF publishes a Travel and Tourism Competitiveness Report every few years, which includes specific data on the tourism industry and sustainability. Yale developed the Environmental Performance Index (EPI), which is an index composed of environmental indicators, and publishes new scores each year on a country level. Data from 2007, 2008, 2009, 2011, 2013, and 2015 will be used from 144 countries.

Factor Analysis

Factor analysis is a method often used to create composite indices, in particular for concepts of a complex social nature. It has the capacity to analyze correlations between observable variables and an unobservable variable, or factor. Factor analysis commences with observed variables, which are assumed to have some linear relation to the common factor, and then derives likely component variables (Mulaik, 2010). Instead of just using equal weighting for all variables, factor analysis filters out variables which are not relevant and attributes more realistic and accurate weightings to each of them. See these webpages for more information (Short Introduction to Factor Analysis, A Beginner's Guide to Factor Analysis, Confirmatory Factor Analysis, Factor Analysis in Python)

This study will use a confirmatory approach, combining observable variables which are expected to represent the common factor ‘Sustainable Tourism’. This factor will be composed using differentially weighted variables, similar to the ST index and Sustainable Tourism Benchmarking Tool created by Pulido-Fernandez & Sanchez-Rivero (2009) and Cernat & Gourdon (2012), respectively. This index will be based on different observable variables which are currently available for research. The number of selected variables is large during this stage, but will be reduced during the factor analysis, when their individual correlation is determined.

Observable Variables

Sustainability incorporates three common dimensions, the economic, environmental, and social dimensions. This index will include the environmental and social aspects only, to avoid endogeneity when relating the index to economic development (this is a recommendation for the future which will be discussed later). Additionally, other relevant WEF report variables are included which could contribute to the factor, but which are not necessarily attributable to a specific dimension. The following equation will represent the final factor of Sustainable Tourism:

$$ Sustainable \ Tourism \ Index_{ij} = \beta_1 \ S_{1ij} + \beta_2 \ S_{2ij} + ... + \beta_n \ S_{nij}$$

where i and j respectively denote country and year, and where $S_n$ represents some of the independent variables mentioned above. In this equation, $\beta_n$ represents the so-called loading factors, which are better known as the correlations between the observable and unobservable variables.

The following factors are included in the dataset I composed:

Environmental
  1. Protected natural area Nationally protected areas as percentage of total land area [WEF]
  2. EPI The Environmental Performance Index (EPI) ranks countries’ performance on high-priority environmental issues in two areas: protection of human health and protection of ecosystems. Within these two policy objectives the EPI scores country performance in nine issue areas comprised of 20 indicators. Indicators in the EPI assess countries’ proximity to internationally established targets or, in the absence of agreed-upon targets, how individual nations compare relative to the best performing countries [EPI Yale]
Social
  1. Local satisfaction with tourism Are foreign travelers and in particular tourists welcome in your country? (1 = no, citizens in my country do not welcome foreign travelers and tourists, 7 = yes, my country’s citizens are open and welcome foreign travelers and tourists) [WEF]
  2. World heritage sites Number of World Heritage sites in the country [WEF, UNESCO original]
  3. Health services capacity Physician density per 1,000 population [WEF, WHO original]
  4. Gender inequality Employment-to-population ratio of Females [ILO modeled estimates]
WEF Travel and Tourism Competitiveness Reports
  1. Government prioritisation of the travel and tourism industry Is your country’s government taking the necessary steps to ensure that the T&T sector is being developed in a sustainable way?(1 = no, development of the sector does not take into account issues related to environmental protection and sustainable development,7 = yes, sustainable development issues are at the core of the government’s strategy with regard to the industry’s development) [WEF]
  2. Stringency of environmental regulations How stringent is your country’s environmental regulation? (1 = lax compared to most countries, 7 = among the world’s most stringent) [WEF]
  3. Sustainability of travel and tourism industry How would you assess the effectiveness of your government’s efforts to ensure that the T&T sector is being developed in a sustainable way? (1 = very ineffective—development of the sector does not take into account issues related to environmental protection and sustainable development; 7 = very effective—issues related to environmental protection and sustainable development are at the core of the government’s strategy) [WEF]

Results

Answer

I find that Factor Analysis works well as a method to create a composite index with differential weighting. It is able to process large datasets, with many observations per variable and country. The resulting ST Index is realistic and representative of many underlying variables. Using the same loading factors over the same variables each year, would provide an ST index which is comparable over many years.

Prepare Data

Below I first import necessary packages and import the data. I also convert data to numeric, which is necessary for further analysis. The table you see below is of 2015, as an example of the individual datasets. Columns represent the variables mentioned earlier, and the rows represent the countries. Row 0 is Albania, row 143 Zimbabwe.

In [3]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.decomposition import PCA, FactorAnalysis

xls_file2007 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2007.xlsx")
pd.to_numeric(xls_file2007, errors='coerce')

xls_file2007
xls_file2007.sheet_names

Sustain2007 = xls_file2007.parse('Blad1')

xls_file2008 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2008.xlsx")
pd.to_numeric(xls_file2008, errors='coerce')

xls_file2008
xls_file2008.sheet_names

Sustain2008 = xls_file2008.parse('Blad1')

xls_file2009 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2009.xlsx")
pd.to_numeric(xls_file2009, errors='coerce')

xls_file2009
xls_file2009.sheet_names

Sustain2009 = xls_file2009.parse('Blad1')

xls_file2011 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2011.xlsx")
pd.to_numeric(xls_file2011, errors='coerce')

xls_file2011
xls_file2011.sheet_names

Sustain2011 = xls_file2011.parse('Blad1')

xls_file2013 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2013.xlsx")
pd.to_numeric(xls_file2013, errors='coerce')

xls_file2013
xls_file2013.sheet_names

Sustain2013 = xls_file2013.parse('Blad1')

xls_file2015 = pd.ExcelFile("C:/Users/asus/Documents/Master/Thesis/Data/FactorAnalysis per Year/PythonSustainable2015.xlsx")
pd.to_numeric(xls_file2015, errors='coerce')

xls_file2015
xls_file2015.sheet_names

Sustain2015 = xls_file2015.parse('Blad1')

Sustain2015
Out[3]:
ProtectArea EPI Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
0 9.5 73.14 NaN 1.145 33.356 4.200105 2.678866 3.994575
1 7.4 71.94 0.5 1.207 13.672 3.834122 2.789911 3.165064
2 12.1 50.83 NaN 0.166 55.589 2.757122 2.567622 2.167017
3 6.7 79.80 4.0 3.155 44.642 4.614444 3.409176 3.671476
4 8.1 81.24 NaN 2.687 44.469 5.018530 3.336461 3.711837
5 15.1 87.58 14.0 3.273 55.014 5.675631 5.563674 5.002163
6 23.6 85.68 NaN 4.830 51.825 6.138436 6.199835 5.677030
7 7.4 83.68 NaN 3.432 58.204 5.816292 3.944133 4.539214
8 6.8 70.86 NaN 0.913 37.565 5.399284 4.379755 4.328502
9 4.2 43.66 1.0 0.356 40.966 3.979071 3.166646 3.023210
10 0.1 56.95 NaN 1.811 55.767 6.546030 4.433184 4.987528
11 24.5 79.56 NaN 2.993 44.450 4.983415 5.996967 4.879974
12 NaN 43.83 NaN NaN 69.297 NaN NaN NaN
13 20.8 71.33 1.0 0.473 61.122 3.606636 3.662446 3.582253
14 NaN 61.44 NaN NaN 27.035 NaN NaN NaN
15 37.2 70.84 1.0 0.336 58.048 5.655652 4.399339 4.993260
16 26.0 80.11 7.0 1.891 50.405 4.283221 5.437597 3.568283
17 NaN 63.85 NaN NaN 49.912 NaN NaN NaN
18 35.4 82.52 2.0 3.812 44.513 4.392558 3.474735 3.664255
19 15.2 43.76 NaN 0.047 74.968 5.179798 3.388413 4.088744
20 4.9 42.76 NaN 0.028 83.016 3.806872 2.932308 3.122032
21 0.2 51.75 NaN 0.295 46.470 5.605283 3.970235 4.390600
22 23.8 51.42 NaN 0.227 75.427 5.322787 3.195513 4.259412
23 10.9 57.33 2.0 0.077 67.315 4.531674 3.689489 3.762536
24 7.0 85.06 9.0 2.068 57.141 5.473536 5.115687 4.941287
25 16.6 37.86 1.0 0.037 59.507 4.765618 3.709688 4.087814
26 15.0 78.60 NaN 1.024 47.156 4.551321 4.578674 4.222706
27 16.1 66.26 12.0 1.456 61.224 4.821153 3.882688 4.608015
28 20.8 77.81 2.0 1.471 51.061 4.974825 3.716867 4.467664
29 22.6 80.98 3.0 1.113 41.043 6.076979 5.167185 5.382081
... ... ... ... ... ... ... ... ...
114 3.4 87.64 NaN 1.921 57.186 6.310279 5.431661 5.828860
115 36.1 83.69 2.0 3.000 44.782 3.571068 4.885139 3.217039
116 54.9 86.76 1.0 2.516 46.930 4.279142 5.181228 3.827235
117 6.6 70.75 3.5 0.776 33.491 5.923585 4.915522 5.127805
118 25.3 88.73 4.0 3.695 40.019 6.382867 4.553023 4.847801
119 15.4 66.27 2.0 0.680 27.780 6.301579 4.803951 5.373160
120 15.2 70.07 1.0 0.911 34.961 4.198647 3.195488 3.507000
121 3.0 60.65 NaN 0.170 28.807 5.264563 4.078149 4.542394
122 13.9 89.74 1.5 3.265 56.435 5.147157 5.866208 4.990514
123 26.3 85.68 3.0 3.938 59.793 5.842593 6.334778 5.672952
124 NaN 64.29 NaN NaN 8.059 NaN NaN NaN
125 19.0 72.58 NaN 2.050 48.111 5.504996 4.817265 4.693491
126 4.8 72.39 1.0 1.899 53.587 5.191489 4.252632 4.239583
127 31.7 57.93 3.5 0.008 71.675 4.919903 3.695608 4.334308
128 16.4 68.70 2.0 0.393 62.538 6.091005 3.562656 4.542452
129 NaN 55.92 NaN NaN 25.412 NaN NaN NaN
130 10.1 76.40 NaN 1.175 50.421 4.075126 3.176598 3.293542
131 4.8 76.83 1.0 1.222 19.659 6.098274 3.876902 4.288198
132 2.1 66.25 1.0 1.711 26.570 5.364635 3.989465 4.684310
133 11.4 57.36 2.0 0.117 80.841 4.989805 3.352473 4.365294
134 NaN 79.01 NaN NaN 47.956 NaN NaN NaN
135 15.5 70.89 NaN 1.930 37.713 6.710087 5.525940 6.485692
136 23.4 89.33 4.5 2.790 54.003 5.532498 5.452449 4.949224
137 15.1 84.53 12.5 2.452 53.123 5.258312 5.359495 4.985988
138 2.6 73.64 NaN 3.736 49.803 5.759555 4.527127 4.918309
139 49.5 77.64 1.0 1.939 47.502 2.477484 3.256868 1.988750
140 4.7 57.97 2.5 1.159 72.303 4.838365 3.039912 3.814216
141 1.1 48.31 1.0 0.197 17.658 2.833609 1.829067 2.205431
142 37.8 66.05 1.0 0.066 64.302 5.899775 4.290074 5.140796
143 27.2 59.46 2.0 0.062 73.979 5.280152 3.909862 3.934177

144 rows × 8 columns

Descriptive Statistics

In [5]:
Sustain2007
Sustain2007.describe()
Out[5]:
ProtectArea EPI Satisfaction Heritage HealthCap GenderIneq GovPrior StringEnv
count 109.000000 142.000000 124.000000 105.000000 116.00000 143.000000 124.000000 124.000000
mean 11.131193 52.283239 6.322581 7.066667 1.84569 46.543587 4.474194 4.040323
std 10.407971 16.324255 0.400982 8.511606 1.29604 15.832559 0.942309 1.197417
min 0.000000 17.790000 4.900000 1.000000 0.10000 10.195000 2.400000 2.200000
25% NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN
max 63.800000 86.690000 6.900000 43.000000 4.50000 85.554000 6.500000 6.700000
In [4]:
Sustain2008
Sustain2008.describe()
Out[4]:
ProtectArea EPI Satisfaction Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
count 127.000000 142.000000 129.000000 69.000000 127.000000 143.000000 129.000000 129.000000 129.000000
mean 12.121496 52.830000 6.336977 2.608696 1.730787 46.731713 5.048837 3.958527 4.685349
std 11.093054 16.362286 0.381694 2.680087 1.257168 15.944144 0.851222 1.033398 0.804699
min 0.070000 17.920000 4.850000 1.000000 0.020000 11.046000 2.940000 2.280000 2.740000
25% NaN NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN NaN
max 62.950000 88.360000 6.820000 15.000000 4.400000 85.552000 6.710000 6.410000 6.340000
In [11]:
Sustain2009
Sustain2009.describe()
Out[11]:
ProtectArea EPI Satisfaction Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
count 133.000000 142.000000 133.000000 73.000000 132.000000 143.000000 133.000000 134.000000 133.000000
mean 12.520977 52.948521 6.357068 2.602740 1.761591 46.562028 5.463083 4.017761 4.765789
std 11.685139 16.343020 0.378261 2.701541 1.335420 15.797057 0.821996 0.991500 0.845084
min 0.100000 18.050000 4.830000 1.000000 0.020000 10.206000 3.050000 2.180000 2.490000
25% NaN NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN NaN
max 65.700000 88.790000 6.870000 15.000000 5.000000 84.199000 6.850000 6.360000 6.480000
In [12]:
Sustain2011
Sustain2011.describe()
Out[12]:
ProtectArea EPI Satisfaction Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
count 139.000000 142.000000 139.000000 74.000000 138.000000 143.000000 139.000000 139.000000 139.000000
mean 11.251295 53.404366 6.182086 2.675676 1.746522 46.591413 5.253525 4.072302 4.353741
std 9.577232 16.272706 0.422915 2.809515 1.363433 15.832780 0.879882 1.063065 0.853234
min 0.070000 18.350000 4.590000 0.000000 0.010000 8.388000 2.660000 1.910000 2.060000
25% NaN NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN NaN
max 50.180000 88.170000 6.900000 15.000000 5.350000 83.593000 6.840000 6.630000 6.250000
In [13]:
Sustain2013
Sustain2013.describe()
Out[13]:
EPI Satisfaction Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
count 142.000000 139.000000 77.000000 139.000000 143.000000 139.000000 139.000000 139.000000
mean 68.921620 6.140504 2.727273 1.770072 46.798161 5.192878 4.095971 4.332734
std 14.191319 0.440358 2.904789 1.393333 15.617317 0.889771 1.106678 0.855210
min 35.130000 4.090000 1.000000 0.010000 8.148000 2.400000 1.470000 2.100000
25% NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN
max 90.590000 6.830000 16.000000 6.170000 83.110000 6.790000 6.440000 6.250000
In [14]:
Sustain2015
Sustain2015.describe()
Out[14]:
ProtectArea EPI Heritage HealthCap GenderIneq GovPrior StringEnv SustInd
count 135.000000 142.000000 80.00000 135.000000 143.000000 136.000000 136.000000 136.000000
mean 15.263704 70.170915 2.58125 1.762081 47.141972 5.148392 4.218566 4.349224
std 11.296470 13.737734 2.73450 1.435403 15.581087 0.901470 1.012196 0.837868
min 0.100000 36.560000 0.50000 0.008000 8.059000 2.321701 1.829067 1.988750
25% NaN NaN NaN NaN NaN NaN NaN NaN
50% NaN NaN NaN NaN NaN NaN NaN NaN
75% NaN NaN NaN NaN NaN NaN NaN NaN
max 54.900000 90.740000 14.00000 7.739000 83.862000 6.710087 6.340268 6.485692

Findings

From the descriptives above we can see that most datapoints lie between 0 and 7, because they originate from survey questions which are measured on a likert scale. This is particularly useful because this means the ST index will also result in a balanced score. However, there are also extreme outliers which will affect the ST index later on. The variables ProtectArea, EPI, and GenderIneq all show much higher values, and will significantly alter the ST index scores when they are included in the calculation. This is not necessarily a problem, but must be taken into account when interpreting the ST Index scores, and comparing them over multiple years.

Factor Analysis

As I mentioned earlier, to create a composite index I must first perform the factor analyis. This analysis will result in a list of factor loadings, which represent the different weights I can attribute to each variable. Again, I only show the data for 2015, since the method is the same in each period.

In [6]:
from sklearn import decomposition, preprocessing
import numpy as np
import pandas as pd

data2007 = Sustain2007
data2007 = data2007[~np.isnan(data2007).any(axis=1)] # take out values of NaN to ensure only numeric data remains
data_normal2007 = preprocessing.scale(data2007) #normalisation
fa2007 = decomposition.FactorAnalysis(n_components = 1) # decomposition
fa2007.fit(data_normal2007) # Factor analysis

data2008 = Sustain2008
data2008 = data2008[~np.isnan(data2008).any(axis=1)] 
data_normal2008 = preprocessing.scale(data2008) 
fa2008 = decomposition.FactorAnalysis(n_components = 1) 
fa2008.fit(data_normal2008) 

data2009 = Sustain2009
data2009 = data2009[~np.isnan(data2009).any(axis=1)] 
data_normal2009 = preprocessing.scale(data2009) 
fa2009 = decomposition.FactorAnalysis(n_components = 1) 
fa2009.fit(data_normal2009)  

data2011 = Sustain2011
data2011 = data2011[~np.isnan(data2011).any(axis=1)] 
data_normal2011 = preprocessing.scale(data2011)
fa2011 = decomposition.FactorAnalysis(n_components = 1) 
fa2011.fit(data_normal2011) 

data2013 = Sustain2013
data2013 = data2013[~np.isnan(data2013).any(axis=1)] 
data_normal2013 = preprocessing.scale(data2013) 
fa2013 = decomposition.FactorAnalysis(n_components = 1) 
fa2013.fit(data_normal2013) 
    
print Sustain2015

data2015 = Sustain2015
data2015 = data2015[~np.isnan(data2015).any(axis=1)]
data_normal2015 = preprocessing.scale(data2015)
fa2015 = decomposition.FactorAnalysis(n_components = 1)
fa2015.fit(data_normal2015)
for score in fa2015.score_samples(data_normal2015):
    print -score # Factor analysis scores
     ProtectArea    EPI  Heritage  HealthCap  GenderIneq  GovPrior  StringEnv  \
0            9.5  73.14       NaN      1.145      33.356  4.200105   2.678866   
1            7.4  71.94       0.5      1.207      13.672  3.834122   2.789911   
2           12.1  50.83       NaN      0.166      55.589  2.757122   2.567622   
3            6.7  79.80       4.0      3.155      44.642  4.614444   3.409176   
4            8.1  81.24       NaN      2.687      44.469  5.018530   3.336461   
5           15.1  87.58      14.0      3.273      55.014  5.675631   5.563674   
6           23.6  85.68       NaN      4.830      51.825  6.138436   6.199835   
7            7.4  83.68       NaN      3.432      58.204  5.816292   3.944133   
8            6.8  70.86       NaN      0.913      37.565  5.399284   4.379755   
9            4.2  43.66       1.0      0.356      40.966  3.979071   3.166646   
10           0.1  56.95       NaN      1.811      55.767  6.546030   4.433184   
11          24.5  79.56       NaN      2.993      44.450  4.983415   5.996967   
12           NaN  43.83       NaN        NaN      69.297       NaN        NaN   
13          20.8  71.33       1.0      0.473      61.122  3.606636   3.662446   
14           NaN  61.44       NaN        NaN      27.035       NaN        NaN   
15          37.2  70.84       1.0      0.336      58.048  5.655652   4.399339   
16          26.0  80.11       7.0      1.891      50.405  4.283221   5.437597   
17           NaN  63.85       NaN        NaN      49.912       NaN        NaN   
18          35.4  82.52       2.0      3.812      44.513  4.392558   3.474735   
19          15.2  43.76       NaN      0.047      74.968  5.179798   3.388413   
20           4.9  42.76       NaN      0.028      83.016  3.806872   2.932308   
21           0.2  51.75       NaN      0.295      46.470  5.605283   3.970235   
22          23.8  51.42       NaN      0.227      75.427  5.322787   3.195513   
23          10.9  57.33       2.0      0.077      67.315  4.531674   3.689489   
24           7.0  85.06       9.0      2.068      57.141  5.473536   5.115687   
25          16.6  37.86       1.0      0.037      59.507  4.765618   3.709688   
26          15.0  78.60       NaN      1.024      47.156  4.551321   4.578674   
27          16.1  66.26      12.0      1.456      61.224  4.821153   3.882688   
28          20.8  77.81       2.0      1.471      51.061  4.974825   3.716867   
29          22.6  80.98       3.0      1.113      41.043  6.076979   5.167185   
..           ...    ...       ...        ...         ...       ...        ...   
114          3.4  87.64       NaN      1.921      57.186  6.310279   5.431661   
115         36.1  83.69       2.0      3.000      44.782  3.571068   4.885139   
116         54.9  86.76       1.0      2.516      46.930  4.279142   5.181228   
117          6.6  70.75       3.5      0.776      33.491  5.923585   4.915522   
118         25.3  88.73       4.0      3.695      40.019  6.382867   4.553023   
119         15.4  66.27       2.0      0.680      27.780  6.301579   4.803951   
120         15.2  70.07       1.0      0.911      34.961  4.198647   3.195488   
121          3.0  60.65       NaN      0.170      28.807  5.264563   4.078149   
122         13.9  89.74       1.5      3.265      56.435  5.147157   5.866208   
123         26.3  85.68       3.0      3.938      59.793  5.842593   6.334778   
124          NaN  64.29       NaN        NaN       8.059       NaN        NaN   
125         19.0  72.58       NaN      2.050      48.111  5.504996   4.817265   
126          4.8  72.39       1.0      1.899      53.587  5.191489   4.252632   
127         31.7  57.93       3.5      0.008      71.675  4.919903   3.695608   
128         16.4  68.70       2.0      0.393      62.538  6.091005   3.562656   
129          NaN  55.92       NaN        NaN      25.412       NaN        NaN   
130         10.1  76.40       NaN      1.175      50.421  4.075126   3.176598   
131          4.8  76.83       1.0      1.222      19.659  6.098274   3.876902   
132          2.1  66.25       1.0      1.711      26.570  5.364635   3.989465   
133         11.4  57.36       2.0      0.117      80.841  4.989805   3.352473   
134          NaN  79.01       NaN        NaN      47.956       NaN        NaN   
135         15.5  70.89       NaN      1.930      37.713  6.710087   5.525940   
136         23.4  89.33       4.5      2.790      54.003  5.532498   5.452449   
137         15.1  84.53      12.5      2.452      53.123  5.258312   5.359495   
138          2.6  73.64       NaN      3.736      49.803  5.759555   4.527127   
139         49.5  77.64       1.0      1.939      47.502  2.477484   3.256868   
140          4.7  57.97       2.5      1.159      72.303  4.838365   3.039912   
141          1.1  48.31       1.0      0.197      17.658  2.833609   1.829067   
142         37.8  66.05       1.0      0.066      64.302  5.899775   4.290074   
143         27.2  59.46       2.0      0.062      73.979  5.280152   3.909862   

      SustInd  
0    3.994575  
1    3.165064  
2    2.167017  
3    3.671476  
4    3.711837  
5    5.002163  
6    5.677030  
7    4.539214  
8    4.328502  
9    3.023210  
10   4.987528  
11   4.879974  
12        NaN  
13   3.582253  
14        NaN  
15   4.993260  
16   3.568283  
17        NaN  
18   3.664255  
19   4.088744  
20   3.122032  
21   4.390600  
22   4.259412  
23   3.762536  
24   4.941287  
25   4.087814  
26   4.222706  
27   4.608015  
28   4.467664  
29   5.382081  
..        ...  
114  5.828860  
115  3.217039  
116  3.827235  
117  5.127805  
118  4.847801  
119  5.373160  
120  3.507000  
121  4.542394  
122  4.990514  
123  5.672952  
124       NaN  
125  4.693491  
126  4.239583  
127  4.334308  
128  4.542452  
129       NaN  
130  3.293542  
131  4.288198  
132  4.684310  
133  4.365294  
134       NaN  
135  6.485692  
136  4.949224  
137  4.985988  
138  4.918309  
139  1.988750  
140  3.814216  
141  2.205431  
142  5.140796  
143  3.934177  

[144 rows x 8 columns]
14.7061497892
10.2177005086
15.9361560942
10.4323743157
10.5203364775
10.4850036784
10.0768557718
11.8455553418
7.99375141339
10.8270150822
11.3354351073
13.5074231009
7.56351837738
9.70165849914
7.85720398851
8.22677387619
9.41138494557
14.68918755
9.78290743257
9.60903592107
7.52788558397
12.7148369576
11.2204921834
8.92145403222
7.82904868378
7.28079664229
9.46906474326
11.1752901233
7.93663770009
11.0123661723
9.14708733253
10.7790398632
8.78040340944
7.50793899501
11.0642739156
8.68999048737
13.1402373698
9.24399304109
9.96430965721
11.9718324693
11.2424545214
7.47883593277
11.3514853251
7.99461269791
12.408455958
9.5963323453
8.88133733287
11.1220049164
9.35522012995
8.12239326226
7.59089140031
8.1961971003
10.1524898199
8.66490302645
9.71010209183
15.2408019297
8.3035589637
11.8883049085
13.3120763489
9.80057486959
8.85017642484
11.1311891794
9.39951468243
8.65237409454
11.3283654741
7.32911607833
9.79831526798
8.96394872704
11.0747225028
9.08073897044
9.38740959655
7.67198961561
13.5697332624
23.6225764247
8.90853390419
18.5335926666
11.7754765615
9.31807624818

From the Factor Analysis scores above, I can compute the Factor Analysis Loadings. These loadings represent the differential weighting to be used in the final computation of the ST Index.

In [7]:
Loading2007 = pd.DataFrame(fa2007.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv'))
Loading2007
print pd.DataFrame(-fa2007.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv'))
# gives the factor loadings

Loading2008 = pd.DataFrame(fa2008.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
Loading2008
print pd.DataFrame(-fa2008.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))

Loading2009 = pd.DataFrame(fa2009.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
Loading2009
print pd.DataFrame(-fa2009.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))

Loading2011 = pd.DataFrame(fa2011.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
Loading2011
print pd.DataFrame(-fa2011.components_, columns=('ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))

Loading2013 = pd.DataFrame(fa2013.components_, columns=('EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
Loading2013
print pd.DataFrame(-fa2013.components_, columns=('EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))

Loading2015 = pd.DataFrame(fa2015.components_, columns=('ProtectArea', 'EPI', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
Loading2015
print pd.DataFrame(-fa2015.components_, columns=('ProtectArea', 'EPI', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'))
   ProtectArea       EPI  Satisfaction  Heritage  HealthCap  GenderIneq  \
0     0.286224  0.986312      0.018201  0.331698   0.709648    0.018977   

   GovPrior  StringEnv  
0   0.01186   0.777657  
   ProtectArea       EPI  Satisfaction  Heritage  HealthCap  GenderIneq  \
0     -0.08715  0.243535       0.68323  0.180948  -0.064265     0.04019   

   GovPrior  StringEnv   SustInd  
0  0.878366   0.552252  0.992939  
   ProtectArea       EPI  Satisfaction  Heritage  HealthCap  GenderIneq  \
0    -0.153331  0.162599      0.746392  0.164256  -0.042939    0.028881   

   GovPrior  StringEnv   SustInd  
0  0.941884   0.444508  0.973355  
   ProtectArea       EPI  Satisfaction  Heritage  HealthCap  GenderIneq  \
0    -0.155709  0.349313      0.667062  0.211852   0.069243    0.050142   

   GovPrior  StringEnv   SustInd  
0  0.828213   0.604534  0.970997  
        EPI  Satisfaction  Heritage  HealthCap  GenderIneq  GovPrior  \
0  0.381605      0.571026  0.237333   0.166761    0.167314  0.777796   

   StringEnv   SustInd  
0   0.685753  0.970045  
   ProtectArea       EPI  Heritage  HealthCap  GenderIneq  GovPrior  \
0     0.339649  0.979355  0.231035   0.812514   -0.121435  0.230918   

   StringEnv   SustInd  
0   0.678779  0.345624  
In [4]:
import plotly.plotly as py
from plotly.tools import FigureFactory as FF
import plotly.tools as tls
tls.set_credentials_file(username='JoelleDuff', api_key='NkSk1R88B7vYzL33bkuk')
import pandas as pd

data_matrix = [['Year', 'ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
               ['2007', 0.286224, 0.986312, 0.018201, 0.331698, 0.709648, 0.018977, 0.01186, 0.777657, 'NaN'],
               ['2008', -0.08715, 0.243535, 0.68323, 0.180948, -0.064265, 0.04019, 0.878366, 0.552252, 0.992939],
               ['2009', -0.153331, 0.162599, 0.746392, 0.164256, -0.042939, 0.028881, 0.941884, 0.444508, 0.973355],
               ['2011', -0.155709, 0.349313, 0.667062, 0.211852, 0.069243, 0.050142, 0.828213, 0.604534, 0.970997],
               ['2013', 'NaN', 0.381605, 0.571026, 0.237333, 0.166761, 0.167314, 0.777796, 0.685753, 0.970045],
               ['2015', 0.339649, 0.979355, 'NaN', 0.231035, 0.812514, -0.121435, 0.230918, 0.678779, 0.345624]]

table = FF.create_table(data_matrix, index=True, index_title='Year')
py.iplot(table, filename='Loading Factors')
Out[4]:

The table above shows the loading factors clearly, for each year. To make it more clear which variables are most promising to include in the index, the following scatterplot is created:

In [5]:
import plotly.plotly as py
import plotly.graph_objs as go

import plotly.plotly as py
import plotly.graph_objs as go

trace2007 = go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=[0.286224, 0.986312, 0.018201, 0.331698, 0.709648, 0.018977, 0.01186, 0.777657, 'NaN'],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2007,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
trace2008 = go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=[-0.08715, 0.243535, 0.68323, 0.180948, -0.064265, 0.04019, 0.878366, 0.552252, 0.992939],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2008,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
trace2009 = go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=[-0.153331, 0.162599, 0.746392, 0.164256, -0.042939, 0.028881, 0.941884, 0.444508, 0.444508],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2009,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
trace2011= go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=[-0.155709, 0.349313, 0.667062, 0.211852, 0.069243, 0.050142, 0.828213, 0.604534, 0.970997],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2011,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
trace2013= go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=['NaN', 0.381605, 0.571026, 0.237333, 0.166761, 0.167314, 0.777796, 0.685753, 0.970045],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2013,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
trace2015= go.Scatter(
    x=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    y=[0.339649, 0.979355, 'NaN', 0.231035, 0.812514, -0.121435, 0.230918, 0.678779, 0.345624],
    text=['ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
    mode='markers',
    name=2015,
    marker=dict(
        size=[100,100,100,100,100,100,100,100,100],
        sizemode='area',
    )
)
data = [trace2007, trace2008, trace2009, trace2011, trace2013, trace2015]
py.iplot(data, filename='bubblechart-size-ref')
Out[5]:

This scatterplot shows the distribution of the loadings for each variable and for each year. A method often used to choose which variables will be included, is to chose those that are greater than $0.3$. Furthermore, from this scatterplot we see that several variables are particularly important within the factor, as they return each year. Satisfaction, GovPrior, StringEnv, and SustInd are often above $0.5$.

Sustainability Index Scores Calculation

Now we will construct the Sustainability Index scores, based on the loading factors given earlier. The method is repeated each year, using the corresponding loading factors in the table below.

In [6]:
data_loadings = [['Indicator', '2007', 'Indicator', '2008', 'Indicator', '2009', 'Indicator', '2011', 'Indicator', '2013', 'Indicator', '2015'],
               ['EPI', 0.986, 'SustInd', 0.993, 'SustInd', 0.973, 'SustInd', 0.971, 'SustInd', 0.970, 'EPI', 0.979],
               ['StringEnv', 0.778, 'GovPrior', 0.878, 'GovPrior', 0.942, 'GovPrior', 0.828, 'GovPrior', 0.778, 'HealthCap', 0.812],
               ['HealthCap', 0.710, 'Satisfaction', 0.683, 'Satisfaction', 0.746, 'Satisfaction', 0.667, 'StringEnv', 0.686, 'StringEnv', 0.679],
               ['Heritage', 0.332, 'StringEnv', 0.552, 'StringEnv', 0.445, 'StringEnv', 0.605, 'Satisfaction', 0.571, 'SustInd', 0.346],
                ['-', '-', '-', '-', '-', '-', 'EPI', 0.349, 'EPI', 0.382, 'ProtectArea', 0.340]]

table2 = FF.create_table(data_loadings)
py.iplot(table2, filename='LoadingFactorsReduced')
Out[6]:

These are the corresponding function for each Sustainable Tourism Index:

$\ $

$$Sustainable \ Tourism \ Index \ (2007) = 0,986 \ EPI + 0,778 \ StringEnv + 0,71 \ HealthCap + 0,332 \ Heritage$$
$$Sustainable \ Tourism \ Index \ (2008) = 0,993 \ SustInd + 0,878 \ GovPrior + 0,683 \ Satisfaction + 0,552 \ StringEnv$$
$$Sustainable \ Tourism \ Index \ (2009) = 0,973 \ SustInd + 0,942 \ GovPrior + 0,746 \ Satisfaction + 0,445 \ StringEnv$$
$$Sustainable \ Tourism \ Index \ (2011) = 0,971 \ SustInd + 0,828 \ GovPrior + 0,667 \ Satisfaction + 0,605 \ StringEnv + 0,349 \ EPI$$

$$Sustainable \ Tourism \ Index \ (2013) = 0,97 \ SustInd + 0,778 \ GovPrior + 0,686 \ StringEnv + 0,571\ Satisfaction + 0,382\ EPI$$
$$Sustainable\ Tourism\ Index \ (2015) = 0,979\ EPI + 0,812\ HealthCap + 0,679\ StringEnv + 0,346\ SustInd + 0,34\ ProtectArea$$

Below I have created matrices for the loading factors and datasets, such that $STVar2007$ represents the combination of $MatrixST2007$ (loading factors) and $MatrixVar2007$ (variables of dataset which are included). Then, the seperate columns of $STVar2007$ are summed together to create the ST index, $ST2007$. This method is used for each period.

In [11]:
MatrixST2007=[0.986, 0.778, 0.71, 0.332] # (1,4) matrix of loading factors
MatrixVar2007=Sustain2007[['EPI', 'StringEnv', 'HealthCap', 'Heritage']].values # (144,4) matrix of variables per country
STVar2007=MatrixST2007*MatrixVar2007 # combination of previous matrices
ST2007=np.sum([STVar2007], axis=2) 
ST2007=zip([ST2007])
ST2007=np.reshape(ST2007, (144,1))# sum seperate colums of new (144,4) matrix to create (114,1) matrix, where the new column shows the scores. 

MatrixST2008=[0.993, 0.878, 0.683, 0.552]
MatrixVar2008=Sustain2008[['SustInd', 'GovPrior', 'Satisfaction', 'StringEnv']].values
STVar2008=MatrixST2008*MatrixVar2008
ST2008=np.sum([STVar2008], axis=2)
ST2008=zip([ST2008])
ST2008=np.reshape(ST2008, (144,1))

MatrixST2009=[0.973, 0.942, 0.746, 0.445]
MatrixVar2009=Sustain2009[['SustInd', 'GovPrior', 'Satisfaction', 'StringEnv']].values
STVar2009=MatrixST2009*MatrixVar2009
ST2009=np.sum([STVar2009], axis=2)
ST2009=zip([ST2009])
ST2009=np.reshape(ST2009, (144,1))

MatrixST2011=[0.971, 0.828, 0.667, 0.605, 0.349]
MatrixVar2011=Sustain2011[['SustInd', 'GovPrior', 'Satisfaction', 'StringEnv', 'EPI']].values
STVar2011=MatrixST2011*MatrixVar2011
ST2011=np.sum([STVar2011], axis=2)
ST2011=zip([ST2011])
ST2011=np.reshape(ST2011, (144,1))

MatrixST2013=[0.97, 0.778, 0.686, 0.571, 0.382]
MatrixVar2013=Sustain2013[['SustInd', 'GovPrior', 'StringEnv', 'Satisfaction', 'EPI']].values
STVar2013=MatrixST2013*MatrixVar2013
ST2013=np.sum([STVar2013], axis=2)
ST2013=zip([ST2013])
ST2013=np.reshape(ST2013, (144,1))

MatrixST2015=[0.979, 0.812, 0.679, 0.346, 0.34]
MatrixVar2015=Sustain2015[['EPI', 'HealthCap', 'StringEnv', 'SustInd', 'ProtectArea']].values
STVar2015=MatrixST2015*MatrixVar2015
ST2015=np.sum([STVar2015], axis=2)
ST2015=zip([ST2015])
ST2015=np.reshape(ST2015, (144,1))

ST=np.concatenate([ST2007,ST2008], axis=1) # Combine all ST index scores into one matrix containing all years
ST=np.concatenate([ST,ST2009], axis=1)
ST=np.concatenate([ST,ST2011], axis=1)
ST=np.concatenate([ST,ST2013], axis=1)
ST=np.concatenate([ST,ST2015], axis=1)
print ST 
[[  55.7045       13.22739      14.6333       32.94164      40.22614
    78.9648732 ]
 [  54.7829       14.20833      13.56946      30.78834      36.61203
    76.9148061 ]
 [          nan           nan           nan   20.30713              nan
    56.50456498]
 [  54.78108      14.23779      14.68431      30.77933      43.77547
    86.5492211 ]
 [  63.64588      14.74822      15.23773      35.12867      43.05613
    88.01955651]
 [  91.97812      18.30376      18.87045      46.12517      50.47845
    99.04097894]
 [  86.1268       19.1062       19.43592      46.94045      50.74361
   102.00062016]
 [  59.63164      14.49796      16.64417      34.26006      46.78999
    91.47413836]
 [  55.61964      15.65555      16.90264      35.11366      42.49414
    76.89681159]
 [  28.48002      12.49383      13.23652      21.38548      28.05586
    47.65639557]
 [          nan   18.31559      19.27034      34.19201      38.69009
    61.99439818]
 [  75.91606      16.86513      17.51757      39.63014      46.34838
    94.40996733]
 [          nan   15.16935      15.72526      25.96683      27.35169
            nan]
 [  54.22934      11.08499      11.39608      29.15868      37.51735
    81.01440612]
 [  46.78012      11.81455      13.00665      27.09735      34.97907
            nan]
 [  48.77496      16.38452      17.85683      32.8061       42.64949
    86.98801093]
 [  61.43014      14.418        15.04318      33.63619      45.07633
    93.72993612]
 [          nan           nan   16.63813      39.19499      40.41449
            nan]
 [  69.0878       14.21704      14.69625      34.91163      43.34511
    99.54560122]
 [          nan   16.45208      16.8069       29.95679      30.58345
    51.76264146]
 [          nan   11.91487      12.78301      20.38951      27.02672
    46.62203646]
 [          nan           nan           nan   30.65168      33.9592
    55.18572732]
 [  36.89402      16.1013       15.82499      27.94465      33.04257
    62.260014  ]
 [  37.53816      12.32929      13.61701      25.78848      34.28986
    63.70159486]
 [  81.74524      17.60535      18.22215      42.60556      48.95686
    92.51619279]
 [          nan   12.04954      13.20896      24.00178      26.05933
    46.67224554]
 [  71.98766      14.88431      15.41869      38.96252      44.54166
    87.45086378]
 [  56.23236      14.76647      16.04104      30.70882      39.24346
    75.75553031]
 [  55.86804      15.73591      16.82574      32.67684      43.78937
    88.51200636]
 [  62.71106      16.95145      18.60264      37.99896      47.96379
    93.23789441]
 [          nan           nan   11.5621       25.01358      34.06577
    69.86134206]
 [  66.87296      16.39945      16.84176      37.47974      46.60803
    93.89265946]
 [  70.39626      17.07069      17.87279      39.76954      46.15053
    90.52115397]
 [  89.73646      15.7978       16.93379      44.04196      46.92382
    97.44062415]
 [  83.41836      17.91572      18.18948      43.73353      48.72506
   103.53978336]
 [  57.00602      15.97006      17.03814      34.53045      43.78954
    87.80999505]
 [  61.02652      13.04666      13.80388      34.77943      40.33292
            nan]
 [  64.52688      16.17625      17.07665      37.15175      39.5789
    73.56872374]
 [  46.12212      15.42131      16.57456      29.42237      39.56188
    75.95751294]
 [  79.37528      16.57035      17.29444      42.35557      49.9174
   102.25631275]
 [          nan   13.62702      15.19817      28.38076      30.44958
    55.16207535]
 [  83.96228      16.97619      17.61252      43.35124      51.84122
   102.42914712]
 [  85.79794      17.16493      18.4015       42.4713       49.67478
   103.5453225 ]
 [  32.0104       17.70376      18.9574       27.82887      36.37529
    58.03003849]
 [  51.44262      15.65241      16.8993       31.55031      39.07585
    71.13449264]
 [  96.20376      17.11243      16.77266      44.63829      47.99195
   107.5177977 ]
 [          nan           nan   16.22952      25.53627      35.336
    66.38886581]
 [  82.5192       17.22483      17.53996      41.01473      45.95902
    98.13983588]
 [  50.1785       15.35697      16.67803      29.72589      38.71768
    82.63437088]
 [          nan   14.93222      15.5847       27.93183      41.28763
    74.87035696]
 [          nan           nan           nan           nan   27.04817
    45.24642481]
 [  50.58114      16.00239      16.98128      31.45147      39.72817
    78.19121441]
 [          nan   18.48246      18.63635              nan           nan
            nan]
 [  76.46836      15.95295      15.77623      38.91444      45.84628
    95.48654088]
 [  80.74016      17.49953      18.18359      44.21847      51.22289
   101.22373148]
 [  42.12914      16.41931      16.82549      25.55357      35.15834
    60.28729217]
 [  47.26612      16.23109      16.8187       30.60221      38.41938
    72.09126709]
 [          nan           nan           nan   30.08654      37.15678
    70.12093143]
 [  79.76898      18.08454      18.80264      43.68713      49.05666
    97.75987563]
 [  74.04702      16.58602      16.60345      38.473        45.90637
    88.55394831]
 [  93.619        14.52629      14.67095      40.26075      45.93251
    96.34535024]
 [          nan   16.54112      17.65976      36.91445      45.14332
    82.46742485]
 [  80.6565       16.11391      16.60635      41.59357      47.1034
    90.96608295]
 [  61.12192      16.42388      17.3397       35.45898      41.4663
            nan]
 [  56.5447       14.05447      14.49556      31.00779      41.07244
    79.44688973]
 [  39.51062      17.02031      18.59748      29.15776      39.11501
    70.07032302]
 [  67.81584      16.45559      16.12186      35.92005      41.28466
    77.59842037]
 [          nan   11.25157      11.05387      33.35187      33.27376
    70.25127536]
 [          nan   13.99595      14.46309      26.35536      38.8316
    78.50323779]
 [  67.67458      14.51901      15.2932       35.77324      46.10604
    97.33686885]
 [          nan           nan           nan   31.13886      37.98571
    70.28062338]
 [          nan   13.38274      15.34923      20.7971       28.99613
    50.30564746]
 [          nan   13.45896      15.22052      26.97715              nan
            nan]
 [  65.8507       14.35283      14.83834      34.91454      44.99889
    97.01799806]
 [  88.89088      16.5253       16.48356      46.35988      49.57039
   106.05844565]
 [  51.94162      13.01207      14.74196      31.46762      41.69921
    82.9423363 ]
 [  27.36044      16.08511      17.35469      24.16351      26.03235
    41.09510785]
 [          nan           nan   16.24574      28.94564      30.55656
    58.02338763]
 [  64.16832      18.68912      17.80107      38.11455      46.89056
    85.78001856]
 [  21.50734      17.14032      18.32399      22.49874      30.55058
    46.51414141]
 [  72.56032      16.75742      17.73892      40.51391      49.92898
    96.24366283]
 [  29.08076      14.56009      15.71686      23.25381      27.95139
    49.13741038]
 [  60.61218      18.4931       18.83241      38.01705      43.6349
    76.45688956]
 [  65.68606      16.55431      17.18223      34.9055       43.42732
    82.67976806]
 [  55.90124      13.25738      13.52429      30.94181      39.49957
    81.02997879]
 [  47.56508      12.90809      12.7076       27.6949       36.22269
    75.26178181]
 [          nan   17.03061      17.72954      36.39114      45.34966
    86.29909146]
 [  55.89398      17.15546      17.67568      35.11451      43.87134
    84.41519412]
 [          nan   15.23514      16.29861      26.13767      28.65057
    50.1590437 ]
 [          nan   15.96824      18.22114      32.63624      42.6192
    89.67680178]
 [  39.33066      15.22309      16.07874      27.42979      32.81233
    59.26237249]
 [  83.9119       17.18389      17.9229       43.74446      47.22581
    99.98418236]
 [  82.7061       19.13301      19.66599      46.20838      53.38758
   103.2855246 ]
 [  51.56196      14.88255      15.09406      30.15965      38.06929
    78.37469425]
 [  40.76774      14.87427      15.42628      26.80716      33.17451
    65.33831106]
 [  86.09776      17.28051      17.97044      43.94978      49.59584
    98.76176093]
 [          nan   15.27643      17.49704      34.88418      40.52206
    69.28240315]
 [  38.7306       13.73919      13.57252      23.20253      30.07637
    59.04240893]
 [  59.69974      15.69681      16.84403      35.75124      45.25608
    88.24182481]
 [          nan   10.93863      12.09307      25.07624      38.53842
    75.20346944]
 [  47.633        14.99196      16.39919      31.18377      42.24817
    82.87222268]
 [  48.18544      15.39576      16.61514      29.62989      42.49037
    79.62625692]
 [  73.12712      12.60608      13.17117      37.11815      43.0785
    96.45644042]
 [  80.14892      18.02338      18.55179      43.85432      50.48017
   100.04324299]
 [          nan   17.78562      18.44594              nan           nan
            nan]
 [          nan   17.51603      17.90844      39.53322      42.79289
    80.56159428]
 [  54.30574      12.79749      13.86219      30.85084      41.82611
    92.58179382]
 [  63.83266      11.64719      12.40637      30.90647      42.54884
    92.42751858]
 [          nan           nan           nan   32.1791       37.28513
    59.64508105]
 [          nan   14.62932      15.89944      38.77963      41.44159
    84.07945653]
 [          nan   15.83003      16.8402       29.46911      38.76303
    75.4808586 ]
 [  71.57766      14.16973      14.64984      37.70008      40.48488
    82.71977942]
 [          nan           nan           nan           nan   43.79437
    70.58879082]
 [          nan           nan           nan           nan   29.1628
    51.51780018]
 [          nan   19.60847      20.05713      47.77074      52.65162
    94.22029568]
 [  80.21966      14.11536      14.26074      39.12642      44.08879
   101.07261489]
 [  79.37778      15.88085      16.26328      42.07771      46.79911
   110.48930923]
 [  56.37996      17.69528      18.34241      35.50215      42.2568
    77.25022229]
 [  96.98072      16.96456      18.14041      44.85851      50.23582
   103.2378517 ]
 [  56.50944      16.6123       17.36811      35.18339      40.82456
    75.78748591]
 [  54.6928       12.19514      13.52559              nan   39.91589
    77.88942043]
 [          nan           nan           nan   28.55185      36.82465
    64.87512144]
 [  88.43342      17.49914      18.43893      45.34532      50.95157
   100.94251323]
 [  95.08134      18.11432      18.77713      49.72224      51.60044
   102.28453188]
 [          nan   15.63384      16.45323      33.52949              nan
            nan]
 [          nan   16.62416      17.24794      37.98363      43.35485
    84.07529065]
 [          nan   15.25728      15.90356      25.47257      41.91257
    78.39823079]
 [          nan   16.83382      17.09207      26.89286      35.37109
    71.50695448]
 [  56.58004      17.77938      18.40209      35.28787      42.38168
    77.143148  ]
 [          nan           nan           nan   26.07888              nan
            nan]
 [          nan   13.37196      14.15596      31.61733      41.23559
    82.48017571]
 [  64.6647       18.60084      19.36051      38.92565              nan
    81.95696709]
 [  58.74176      16.24207      17.01869      34.28039      40.01508
    71.29170012]
 [  41.14076      16.21647      16.74992      28.40927      35.1432
    63.91316476]
 [  52.15804      12.16699      13.18765      28.43684      40.92148
            nan]
 [          nan   18.91503      19.82615      44.24602      45.22125
    82.23463261]
 [  90.59648      16.94227      17.04695      43.27964      50.48223
   103.0901945 ]
 [  78.32522      16.30121      16.98691      39.79858      47.98504
    95.24414288]
 [  56.907        16.44894      17.40534      35.63349      44.6632
    80.78684624]
 [  61.79258      10.67432      10.89981      29.35246      38.30114
    97.31354903]
 [  39.98412      15.92858      16.76127      27.94975      33.99981
    62.67555693]
 [          nan           nan           nan           nan   28.69509
    49.83446961]
 [  42.85288      16.89582      17.66268      30.33278      39.74019
    82.26021734]
 [  52.00198      12.70747      13.60693      31.50311      36.25833
    71.52570517]]

Sustainable Tourism Index Result

The table printed below, shows the final ST Index scores per year, per country

In [11]:
STFull = pd.ExcelFile("C:/Users/asus/Documents/Master/AEA I/ST Score Python.xlsx")
pd.to_numeric(STFull, errors='coerce')
STFull
STFull.sheet_names
STFull = STFull.parse('Blad1')
STFull
Out[11]:
Countries 2007 2008 2009 2011 2013 2015
0 Albania 55.70450 13.22739 14.63330 32.94164 40.22614 78.964873
1 Algeria 54.78290 14.20833 13.56946 30.78834 36.61203 76.914806
2 Angola NaN NaN NaN 20.30713 NaN 56.504565
3 Argentina 54.78108 14.23779 14.68431 30.77933 43.77547 86.549221
4 Armenia 63.64588 14.74822 15.23773 35.12867 43.05613 88.019557
5 Australia 91.97812 18.30376 18.87045 46.12517 50.47845 99.040979
6 Austria 86.12680 19.10620 19.43592 46.94045 50.74361 102.000620
7 Azerbaijan 59.63164 14.49796 16.64417 34.26006 46.78999 91.474138
8 Bahrain 55.61964 15.65555 16.90264 35.11366 42.49414 76.896812
9 Bangladesh 28.48002 12.49383 13.23652 21.38548 28.05586 47.656396
10 Barbados NaN 18.31559 19.27034 34.19201 38.69009 61.994398
11 Belgium 75.91606 16.86513 17.51757 39.63014 46.34838 94.409967
12 Benin NaN 15.16935 15.72526 25.96683 27.35169 NaN
13 Bolivia 54.22934 11.08499 11.39608 29.15868 37.51735 81.014406
14 Bosnia and Herzegovina 46.78012 11.81455 13.00665 27.09735 34.97907 NaN
15 Botswana 48.77496 16.38452 17.85683 32.80610 42.64949 86.988011
16 Brazil 61.43014 14.41800 15.04318 33.63619 45.07633 93.729936
17 Brunei Darussalam NaN NaN 16.63813 39.19499 40.41449 NaN
18 Bulgaria 69.08780 14.21704 14.69625 34.91163 43.34511 99.545601
19 Burkina Faso NaN 16.45208 16.80690 29.95679 30.58345 51.762641
20 Burundi NaN 11.91487 12.78301 20.38951 27.02672 46.622036
21 Cabo Verde NaN NaN NaN 30.65168 33.95920 55.185727
22 Cambodia 36.89402 16.10130 15.82499 27.94465 33.04257 62.260014
23 Cameroon 37.53816 12.32929 13.61701 25.78848 34.28986 63.701595
24 Canada 81.74524 17.60535 18.22215 42.60556 48.95686 92.516193
25 Chad NaN 12.04954 13.20896 24.00178 26.05933 46.672246
26 Chile 71.98766 14.88431 15.41869 38.96252 44.54166 87.450864
27 China 56.23236 14.76647 16.04104 30.70882 39.24346 75.755530
28 Colombia 55.86804 15.73591 16.82574 32.67684 43.78937 88.512006
29 Costa Rica 62.71106 16.95145 18.60264 37.99896 47.96379 93.237894
... ... ... ... ... ... ... ...
114 Singapore NaN 19.60847 20.05713 47.77074 52.65162 94.220296
115 Slovak Republic 80.21966 14.11536 14.26074 39.12642 44.08879 101.072615
116 Slovenia 79.37778 15.88085 16.26328 42.07771 46.79911 110.489309
117 South Africa 56.37996 17.69528 18.34241 35.50215 42.25680 77.250222
118 Spain 96.98072 16.96456 18.14041 44.85851 50.23582 103.237852
119 Sri Lanka 56.50944 16.61230 17.36811 35.18339 40.82456 75.787486
120 Suriname 54.69280 12.19514 13.52559 NaN 39.91589 77.889420
121 Swaziland NaN NaN NaN 28.55185 36.82465 64.875121
122 Sweden 88.43342 17.49914 18.43893 45.34532 50.95157 100.942513
123 Switzerland 95.08134 18.11432 18.77713 49.72224 51.60044 102.284532
124 Syrian Arab Republic NaN 15.63384 16.45323 33.52949 NaN NaN
125 Taiwan* NaN 16.62416 17.24794 37.98363 43.35485 84.075291
126 Tajikistan NaN 15.25728 15.90356 25.47257 41.91257 78.398231
127 Tanzania NaN 16.83382 17.09207 26.89286 35.37109 71.506954
128 Thailand 56.58004 17.77938 18.40209 35.28787 42.38168 77.143148
129 Timor-Leste NaN NaN NaN 26.07888 NaN NaN
130 Trinidad and Tobago NaN 13.37196 14.15596 31.61733 41.23559 82.480176
131 Tunisia 64.66470 18.60084 19.36051 38.92565 NaN 81.956967
132 Turkey 58.74176 16.24207 17.01869 34.28039 40.01508 71.291700
133 Uganda 41.14076 16.21647 16.74992 28.40927 35.14320 63.913165
134 Ukraine 52.15804 12.16699 13.18765 28.43684 40.92148 NaN
135 United Arab Emirates NaN 18.91503 19.82615 44.24602 45.22125 82.234633
136 United Kingdom 90.59648 16.94227 17.04695 43.27964 50.48223 103.090195
137 United States 78.32522 16.30121 16.98691 39.79858 47.98504 95.244143
138 Uruguay 56.90700 16.44894 17.40534 35.63349 44.66320 80.786846
139 Venezuela, RB 61.79258 10.67432 10.89981 29.35246 38.30114 97.313549
140 Vietnam 39.98412 15.92858 16.76127 27.94975 33.99981 62.675557
141 Yemen, Rep. NaN NaN NaN NaN 28.69509 49.834470
142 Zambia 42.85288 16.89582 17.66268 30.33278 39.74019 82.260217
143 Zimbabwe 52.00198 12.70747 13.60693 31.50311 36.25833 71.525705

144 rows × 7 columns

In [183]:
print STFull
                  Countries      2007      2008      2009      2011      2013  \
0                   Albania  55.70450  13.22739  14.63330  32.94164  40.22614   
1                   Algeria  54.78290  14.20833  13.56946  30.78834  36.61203   
2                    Angola       NaN       NaN       NaN  20.30713       NaN   
3                 Argentina  54.78108  14.23779  14.68431  30.77933  43.77547   
4                   Armenia  63.64588  14.74822  15.23773  35.12867  43.05613   
5                 Australia  91.97812  18.30376  18.87045  46.12517  50.47845   
6                   Austria  86.12680  19.10620  19.43592  46.94045  50.74361   
7                Azerbaijan  59.63164  14.49796  16.64417  34.26006  46.78999   
8                   Bahrain  55.61964  15.65555  16.90264  35.11366  42.49414   
9                Bangladesh  28.48002  12.49383  13.23652  21.38548  28.05586   
10                 Barbados       NaN  18.31559  19.27034  34.19201  38.69009   
11                  Belgium  75.91606  16.86513  17.51757  39.63014  46.34838   
12                    Benin       NaN  15.16935  15.72526  25.96683  27.35169   
13                  Bolivia  54.22934  11.08499  11.39608  29.15868  37.51735   
14   Bosnia and Herzegovina  46.78012  11.81455  13.00665  27.09735  34.97907   
15                 Botswana  48.77496  16.38452  17.85683  32.80610  42.64949   
16                   Brazil  61.43014  14.41800  15.04318  33.63619  45.07633   
17        Brunei Darussalam       NaN       NaN  16.63813  39.19499  40.41449   
18                 Bulgaria  69.08780  14.21704  14.69625  34.91163  43.34511   
19             Burkina Faso       NaN  16.45208  16.80690  29.95679  30.58345   
20                  Burundi       NaN  11.91487  12.78301  20.38951  27.02672   
21               Cabo Verde       NaN       NaN       NaN  30.65168  33.95920   
22                 Cambodia  36.89402  16.10130  15.82499  27.94465  33.04257   
23                 Cameroon  37.53816  12.32929  13.61701  25.78848  34.28986   
24                   Canada  81.74524  17.60535  18.22215  42.60556  48.95686   
25                     Chad       NaN  12.04954  13.20896  24.00178  26.05933   
26                    Chile  71.98766  14.88431  15.41869  38.96252  44.54166   
27                    China  56.23236  14.76647  16.04104  30.70882  39.24346   
28                 Colombia  55.86804  15.73591  16.82574  32.67684  43.78937   
29               Costa Rica  62.71106  16.95145  18.60264  37.99896  47.96379   
..                      ...       ...       ...       ...       ...       ...   
114               Singapore       NaN  19.60847  20.05713  47.77074  52.65162   
115         Slovak Republic  80.21966  14.11536  14.26074  39.12642  44.08879   
116                Slovenia  79.37778  15.88085  16.26328  42.07771  46.79911   
117            South Africa  56.37996  17.69528  18.34241  35.50215  42.25680   
118                   Spain  96.98072  16.96456  18.14041  44.85851  50.23582   
119               Sri Lanka  56.50944  16.61230  17.36811  35.18339  40.82456   
120                Suriname  54.69280  12.19514  13.52559       NaN  39.91589   
121               Swaziland       NaN       NaN       NaN  28.55185  36.82465   
122                  Sweden  88.43342  17.49914  18.43893  45.34532  50.95157   
123             Switzerland  95.08134  18.11432  18.77713  49.72224  51.60044   
124    Syrian Arab Republic       NaN  15.63384  16.45323  33.52949       NaN   
125                 Taiwan*       NaN  16.62416  17.24794  37.98363  43.35485   
126              Tajikistan       NaN  15.25728  15.90356  25.47257  41.91257   
127                Tanzania       NaN  16.83382  17.09207  26.89286  35.37109   
128                Thailand  56.58004  17.77938  18.40209  35.28787  42.38168   
129             Timor-Leste       NaN       NaN       NaN  26.07888       NaN   
130     Trinidad and Tobago       NaN  13.37196  14.15596  31.61733  41.23559   
131                 Tunisia  64.66470  18.60084  19.36051  38.92565       NaN   
132                  Turkey  58.74176  16.24207  17.01869  34.28039  40.01508   
133                  Uganda  41.14076  16.21647  16.74992  28.40927  35.14320   
134                 Ukraine  52.15804  12.16699  13.18765  28.43684  40.92148   
135    United Arab Emirates       NaN  18.91503  19.82615  44.24602  45.22125   
136          United Kingdom  90.59648  16.94227  17.04695  43.27964  50.48223   
137           United States  78.32522  16.30121  16.98691  39.79858  47.98504   
138                 Uruguay  56.90700  16.44894  17.40534  35.63349  44.66320   
139           Venezuela, RB  61.79258  10.67432  10.89981  29.35246  38.30114   
140                 Vietnam  39.98412  15.92858  16.76127  27.94975  33.99981   
141             Yemen, Rep.       NaN       NaN       NaN       NaN  28.69509   
142                  Zambia  42.85288  16.89582  17.66268  30.33278  39.74019   
143                Zimbabwe  52.00198  12.70747  13.60693  31.50311  36.25833   

           2015  
0     78.964873  
1     76.914806  
2     56.504565  
3     86.549221  
4     88.019557  
5     99.040979  
6    102.000620  
7     91.474138  
8     76.896812  
9     47.656396  
10    61.994398  
11    94.409967  
12          NaN  
13    81.014406  
14          NaN  
15    86.988011  
16    93.729936  
17          NaN  
18    99.545601  
19    51.762641  
20    46.622036  
21    55.185727  
22    62.260014  
23    63.701595  
24    92.516193  
25    46.672246  
26    87.450864  
27    75.755530  
28    88.512006  
29    93.237894  
..          ...  
114   94.220296  
115  101.072615  
116  110.489309  
117   77.250222  
118  103.237852  
119   75.787486  
120   77.889420  
121   64.875121  
122  100.942513  
123  102.284532  
124         NaN  
125   84.075291  
126   78.398231  
127   71.506954  
128   77.143148  
129         NaN  
130   82.480176  
131   81.956967  
132   71.291700  
133   63.913165  
134         NaN  
135   82.234633  
136  103.090195  
137   95.244143  
138   80.786846  
139   97.313549  
140   62.675557  
141   49.834470  
142   82.260217  
143   71.525705  

[144 rows x 7 columns]

To show the results more clearly, I create a line graph below, which shows the trend in ST scores for several countries. Due to limitations in space, only the first few and last few countries in the dataset are shown below (Albania, Barbados, Burundi, Singapore, Sweden, and United Kingdom).

In [7]:
import plotly.plotly as py
import plotly.graph_objs as go

Albania = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = [55.70450, 13.22739, 14.63330, 32.94164, 40.22614, 78.964873],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='Albania',
    line=dict(
        shape='spline')
)
Barbados = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = ['Nan', 18.31559, 19.27034, 34.19201, 38.69009, 61.994398],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='Barbados',
    line=dict(
        shape='spline')
)
Burundi = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = ['Nan', 11.91487, 12.78301, 20.38951, 27.02672, 46.622036],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='Burundi',
    line=dict(
        shape='spline')
) 
Singapore = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = ['Nan', 19.60847, 20.05713, 47.77074, 52.65162, 94.220296],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='Singapore',
    line=dict(
        shape='spline')
)
Sweden = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = [88.43342, 17.49914, 18.43893, 45.34532, 50.95157, 100.942513],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='Sweden',
    line=dict(
        shape='spline')
)
UnitedKingdom = go.Scatter(
    x = ['2007', '2008', '2009', '2011', '2013', '2015'],
    y = [90.59648, 16.94227, 17.04695, 43.27964, 50.48223, 103.090195],
    text=['2007', '2008', '2009', '2011', '2013', '2015'],
    name='United Kingdom',
    line=dict(
        shape='spline')
)
data = [Albania, Barbados, Burundi, Singapore, Sweden, UnitedKingdom]
layout = dict(title = 'Sustainable Tourism Index Trend',
              xaxis = dict(title = 'Year'),
              yaxis = dict(title = 'ST Index (Score)'),
              )
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='STindex scores')
Out[7]:

Sensitivity Analysis

To check if we get realistic and accurate results, I will compare these results with those I found using the statistical programme R. I performed the same anlysis in this programme earlier, and find that the loading factors resemble those calculated by R. Although the loadingfactors are not exactly the same, they are often very similar and in the same direction. This comparison helps support the results found in this python programming.

In [8]:
Rloadings = pd.ExcelFile("C:/Users/asus/Documents/Master/AEA I/Rdata.xlsx")
pd.to_numeric(Rloadings, errors='coerce')
Rloadings
Rloadings.sheet_names
Rloadings = Rloadings.parse('Blad2')
print Rloadings

data_matrix = [['Year', 'ProtectArea', 'EPI', 'Satisfaction', 'Heritage', 'HealthCap', 'GenderIneq', 'GovPrior', 'StringEnv', 'SustInd'],
               ['2007', 0.286224, 0.986312, 0.018201, 0.331698, 0.709648, 0.018977, 0.01186, 0.777657, 'NaN'],
               ['2008', -0.08715, 0.243535, 0.68323, 0.180948, -0.064265, 0.04019, 0.878366, 0.552252, 0.992939],
               ['2009', -0.153331, 0.162599, 0.746392, 0.164256, -0.042939, 0.028881, 0.941884, 0.444508, 0.973355],
               ['2011', -0.155709, 0.349313, 0.667062, 0.211852, 0.069243, 0.050142, 0.828213, 0.604534, 0.970997],
               ['2013', 'NaN', 0.381605, 0.571026, 0.237333, 0.166761, 0.167314, 0.777796, 0.685753, 0.970045],
               ['2015', 0.339649, 0.979355, 'NaN', 0.231035, 0.812514, -0.121435, 0.230918, 0.678779, 0.345624]]

table = FF.create_table(data_matrix, index=True, index_title='Year')
py.iplot(table, filename='Loading Factors')
      ProtectArea   EPI  Satisfaction  Heritage  HealthCap  GenderIneq  \
2007         0.27  0.99          0.09      0.29       0.69       -0.09   
2008         0.05  0.32          0.57      0.18       0.01        0.04   
2009        -0.03  0.27          0.70      0.17       0.06        0.01   
2011        -0.03  0.36          0.67      0.21       0.11        0.09   
2013          NaN  0.37          0.57      0.24       0.20        0.10   
2015         0.07  0.43           NaN      0.20       0.32        0.08   

      GovPrior  StringEnv  SustInd  
2007      0.14       0.78      NaN  
2008      0.85       0.59     1.00  
2009      0.92       0.48     0.95  
2011      0.87       0.64     0.96  
2013      0.77       0.74     0.95  
2015      0.84       0.75      NaN  
Out[8]:

Conclusion, Discussion, and Future Research

The results I got from performing this analysis in python give me similar results as in R. From this, I can conclude that Factor Analysis is a good method for creating a composite index with many underlying variables. The ST index seems to work very well for each year, and is well equipped to give quick insight into the sustainability of a county's tourism industry. So, to answer the research question.. Yes!

There are some caveats to my approach though, which would need to be considered for future research. Some datasets missed certain variables, which meant that the factor loadings differed each year. In particular, for 2007, 2013, and 2015 the datasets differed quite a lot, as did the ST index scores. Fortunately, during 2008, 2009, and 2011 the same variables were included in the ST index, making the indices of those years more comparable with eachtother. Unfortunately however, the scores of 2007, 2013, and 2015 included variables such as ProtectedArea, EPI, and GenderIneq, which, as mentioned earlier, had much higher values than the other indicators. For this reason, the ST index scores of 2007, 2013, and 2015 are much higher than those of the other years.

In the future it would be good to use only those variables for the index which are included in each dataset, and perhaps exclude certain years if they lack important variables. This way the indices over multiple years would be comparable, and important insights could be made. A second option would be to make a single ST index, which combines the individual ones by averaging the loading factors. This would make the scores even more comparable, since they would all be calculated in the same way.