FX Market and Machine Learning

Textual analysis of Donald Trump’s tweets and exchange rates fluctuations

Created on

19.06.2020

Donald Trump’s Twitter account has been linked to several stock market moves, owing to his surprise public announcements, his frequent comments on American foreign policy and the sheer number of posts he publishes. As fluctuations in the FX market tend to be exacerbated by geopolitical and macroeconomic events, this market might also react to the American president’s tweets. Can sentiment indicators extracted from Donald Trump’s tweets help forecast exchange rate movements?


Garlone de Maleville

Advanced Master in Finance ESCP Europe

Foreign exchange market fluctuations are known to be very hard to forecast, as they depend on many variables drawn from two different economic zones. However, a few fundamentals have been identified as having a measurable impact on exchange rate variations, such as key interest rates, the inflation rate, GDP and the trade balance. Most market moves are attributed either to an unexpected change in these fundamentals or to a shift in market participants’ expectations. Those changes and sentiments are conveyed through the media and public announcements, and are even commented on in social networks. Thus, despite their recent emergence, textual analysis methods have proved effective at extracting sentiment from a text and linking that indicator to market fluctuations.

Textual analysis refers to the analysis of a text’s structure and content in relation to the historical and cultural context in which the text was produced. It was already practised before the emergence of the internet, but the technological breakthroughs of recent decades have greatly enhanced it by enabling researchers to process huge amounts of data.

Many of the main economic and geopolitical events of the last three years involve the US and the Trump administration’s foreign policy. Those events are often announced, or quickly commented on, via the president’s Twitter account. Moreover, markets have been very sensitive to the trade tensions that have arisen since 2017. The study examines whether a machine can make good investment decisions using textual analysis methods with Donald Trump’s daily tweets as its source of information.

The Foreign Exchange market

The Foreign Exchange market is the world’s largest financial market, and huge volumes are traded on it every day. The 2019 Triennial Central Bank Survey states that trading in FX markets reached $6.6 trillion per day in April 2019. By comparison, the Nasdaq website reported a total traded volume of $128 billion for 7 November 2019. The FX market is known to be efficient, decentralized and extremely liquid. As an OTC market, it has no clearing house or regulating entity centralizing orders. Two counterparties can make a deal at any time and in any place, because the market has no physical location. Because information is decentralized, it is hard to gather all the trade orders and study them properly. This is why a large share of the studies on volatility and liquidity have been carried out on other markets, especially the stock market, where trades are centralized and data are easier to obtain. Moreover, the FX market is considered the most efficient in the world. The efficient market hypothesis states that at any point in time the price of a security is a relevant measure of its intrinsic value. Prices therefore adjust so quickly after news releases and macro events that capturing a change in market value becomes even more difficult.

However, this asset class looked interesting for the study for the following reason: since the beginning of his term, Trump has been promoting a brand-new foreign policy, completely different from that of his predecessors. This “America First” strategy rejects several ideas that used to be the main pillars of American foreign policy, such as promoting the democratic peace and the liberal strategy. As a result, the Trump administration has taken measures over nearly three years that have had a huge impact on America’s trade relationships. Since 2017, new tariffs have been announced and have come into effect, which has led to trade tensions with China and Europe, the US’s two biggest competitors according to Trump. The economies of some developing countries have been seriously affected by America’s protectionism, such as Mexico since the beginning of Trump’s term, and by other trade tensions, as in Turkey during summer 2018. All these new rules set by the American president have seriously affected the currency pairs concerned, weakening many of the currencies against the USD and increasing their volatility.

Even though the FX market is considered the most liquid market in the world, currency pairs are not equally traded. Traded volumes differ widely from one pair to another, and so do liquidity and volatility. The 2019 Triennial Central Bank Survey states that the most traded currency pairs involving the USD are USD/EUR, USD/JPY and USD/GBP. Because of low traded volumes and political instability, emerging market currencies are known to be extremely volatile. To obtain a representative sample of currency pairs, both high-volume and low-volume pairs need to be modelled. Moreover, most of the tensions of the last three years have set the US against China and the EU. Emerging market countries such as Mexico and Turkey have also been affected by trade tensions. As a result, the following currency pairs are included in the study: USD/EUR, USD/GBP, USD/CNY and USD/TRY.

Brief review of Trump’s tweets content

Trump’s tweets belong to internet postings, a specific type of source that should be studied differently from earnings press releases or news articles. The latter are a reliable source of information because their authors know the market well, whereas internet postings, which can come from anyone, may contain wrong information. Moreover, the topics of press releases and news articles make it easier to identify a sentiment about the market, the economic situation of a company or country, or geopolitical and macroeconomic events. Internet postings provide much more data than the two previous sources, but as conversations and posts are not specifically about finance, economics or geopolitics, the noise is substantial. Antweiler and Frank (2004) analyzed the messages posted on Yahoo! Finance and Raging Bull and compared the extracted sentiment to the Dow Jones 45 companies and the Dow Jones Industrial Average. In this case, even though most of the conversations are about financial markets, there are still irrelevant conversations that need to be filtered out. Moreover, the content of each message, the vocabulary, the abbreviations and the grammar are less professional than in press releases and news articles.

Among the relevant posts taken from Trump’s Twitter account, different kinds of information can be captured. The first comes from the Trump administration’s measures that are posted directly and come as a “surprise event” to market participants. Those events broadly correspond to tariffs imposed on Europe, China, Turkey, Mexico, Canada and others during his term. The literature suggests that such events have a strong impact in the short run. Thus, a model that works on a daily basis should capture the changes before they fade. Another kind of information is the daily comments Trump makes about the economic and geopolitical situation. This information does not necessarily constitute a “surprise event”, as he tweets about data the market already knows. However, this does not mean those tweets have no impact on USD fluctuations. If he often tweets about a specific theme, more people will probably focus on that subject and base decisions on it.

Moreover, market outcomes are not expected to be the same for every tweet, because the information extracted differs from one tweet to another. A tweet about the latest US data releases will not have the same impact as a post about the trade war with China. Indeed, macroeconomic and geopolitical information is not expected to have the same influence on the market. Brandt and Gao (2019) highlight those differences by studying the impact that macroeconomic and geopolitical news can have on the crude oil price. They found that macroeconomic news tends to be more predictive in the long run, while geopolitical news has only an immediate impact on the oil price.

Some macroeconomic data tend to matter more than others. Many research papers try to forecast exchange rate movements, and they report that in the short term the most effective predictors are fluctuations in the nominal interest rate and the inflation rate. Thus, central bank policies tend to have a direct and measurable impact on exchange rate movements. Trump has often mentioned all these subjects over the last three years. Capturing this information in his tweets would give a measure of the pressure he might put on the Fed before it cuts rates. His tweets may also contain commentary on the current economic situation of the US or other countries.

Trump’s Twitter account represents a large amount of data: he has tweeted more than 40,000 times in 10 years. Since he took office in 2017, his comments on Twitter have focused more on US domestic and foreign policy. This is why the study covers nearly three years, from 1st January 2017 to 19th September 2019. However, during this period there are still some posts considered off-topic, and they constitute the “noise” that needs to be filtered out. Trump tweets a lot: on some days he posts up to 20 times. It is therefore possible to study the market’s reaction on a daily basis. The choice has been made to gather all the tweets of one day and study their impact on the exchange rate returns and volatilities of the next day.

Textual analysis and machine learning

Textual analysis tools provide a good way to extract information from Trump’s tweets. Those methods were adopted in accounting and finance very late, and this is still an emerging area. In the financial sector, researchers tend to look for evidence of a correlation between a sentiment extracted from a text and the return or volatility of an asset.

One method used to perform textual analysis is called the “word list” or dictionary-based method. It consists of creating lists of words with specific meanings, like a lexicon. Several lexicons are public, which helps avoid the bias a researcher could introduce by building a custom list. Most of the available dictionaries consist of a positive and a negative word list, so that models can measure sentiment. Four dictionaries are available in the accounting and finance sector: Diction, Henry (2008), Harvard’s GI and Loughran and McDonald (2011). Other libraries and indicators are available. The Python library TextBlob is known as one of the simplest to use. It provides functionality for natural language processing such as spelling correction, noun phrase extraction, language translation, sentiment analysis, classification and others.
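The word list idea can be illustrated with a minimal sketch. The two tiny lists and the scoring rule below (positive matches minus negative matches, divided by the number of matches) are assumptions chosen for illustration; they are not taken from Diction, Henry (2008), Harvard’s GI or Loughran and McDonald (2011), nor from the study itself.

```python
import re

# Hypothetical mini word lists, for illustration only; a real study would
# load a published lexicon instead.
POSITIVE = {"great", "strong", "win", "good"}
NEGATIVE = {"bad", "weak", "unfair", "loss"}


def word_list_sentiment(text):
    """Score a text in [-1, 1] from counts of positive and negative words.

    Returns 0.0 when no word of either list appears in the text.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    matched = positives + negatives
    return (positives - negatives) / matched if matched else 0.0


for example in (
    "Great jobs numbers, strong economy!",
    "Unfair tariffs are bad for our farmers",
    "Meeting at the White House today",
):
    print(example, "->", word_list_sentiment(example))
```

A text that matches neither list is scored as neutral, which is one simple way to handle tweets that carry no recognised vocabulary.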

Textual analysis can also be performed using machine learning methods. Li (2010) uses machine learning to create a sentiment indicator. The sentiment of a given text has only three possible values: -1 for negative, 0 for neutral and 1 for positive. For a large part of the sample (usually 70%), called the “training set”, the author gives the answers to the algorithm. Then, for the remaining part, called the “testing set”, the algorithm has to find the answers by itself. The accuracy on those two sets gives an overview of how well the algorithm performs. Accuracy is the percentage of correct predictions a model has made. The training set accuracy and the testing set accuracy correspond respectively to the in-sample accuracy and the out-of-sample accuracy, terms that are more common in the financial literature.
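As a small illustration of the accuracy measure, the sketch below compares predicted and true labels on a training set and a testing set. The label values are made up for the example and do not come from Li (2010) or from the study.

```python
def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up sentiment labels: -1 negative, 0 neutral, 1 positive.
train_actual = [1, 0, -1, 1, 0, 0, -1]
train_predicted = [1, 0, -1, 1, 0, 1, -1]
test_actual = [0, 1, -1]
test_predicted = [0, -1, -1]

in_sample = accuracy(train_predicted, train_actual)
out_of_sample = accuracy(test_predicted, test_actual)
print(f"in-sample accuracy: {in_sample:.2f}")
print(f"out-of-sample accuracy: {out_of_sample:.2f}")
```

A large gap between the two figures is usually read as a sign that the model has memorised the training set rather than learned something that generalises.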

A common way for the machine to classify given data is decision tree classification. A method derived from it is random forest classification, which builds n different decision trees and combines their predictions, by majority vote or by averaging, to obtain the final classifier. This method fits tabular inputs very well; for more complex inputs such as images, different classifiers should be used. Other studies mention the inclusion of sentiment indicators in trading strategies. Tetlock et al. (2008) modelled an algorithm that shorts companies with negative sentiment news and goes long on the ones with positive sentiment news. The results of all those studies are mixed: some make astonishing gains while others do not.
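The way a forest combines its trees can be sketched without training any tree. The example below assumes majority voting over class labels, which is one common aggregation rule for classification; the per-tree predictions are invented, and the helper names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def forest_predict(per_tree_labels):
    """Combine predictions, where per_tree_labels[k][i] is the label
    that tree k assigns to observation i."""
    return [majority_vote(votes) for votes in zip(*per_tree_labels)]


# Invented output of three trees for three observations.
trees = [
    ["BUY", "HOLD", "SELL"],
    ["BUY", "SELL", "SELL"],
    ["HOLD", "HOLD", "SELL"],
]
print(forest_predict(trees))
```

In a real random forest each tree is also trained on a different bootstrap sample and a random subset of features, which is what makes the trees differ from one another.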

In order to capture only the information that is directly linked to the four currency pairs, a small dedicated dictionary is created. This dictionary has to capture the following subjects: central banks’ monetary policies, trade wars, tariffs and the inflation rate. As Trump is a public figure who needs to be understood by everyone, his vocabulary is not too specialized. Thus, only a few words need to be included in the dictionary. Sentiment indicators are then extracted from the tweets using the Python library TextBlob. The sentiment analysis function of TextBlob returns two features. The polarity is a float in the range [-1, 1]: 1 corresponds to a positive statement and -1 to a negative statement. The subjectivity is a float in the range [0, 1]: 0 corresponds to objective sentences (factual information) and 1 corresponds to subjective sentences (personal opinion). Each tweet thus has those two features; as the study focuses on daily returns, each day is given an average polarity and subjectivity. A trading algorithm is then created; the aim is to link a poor or good return of a currency pair to certain values of the two sentiment indicators and to the presence of specific words in the tweets of the previous day.

Design of the study

To measure the daily sentiment of Trump’s tweets, five sentiment indicators have been used: maximal polarity, minimal polarity, average polarity, maximal subjectivity and average subjectivity. The maximum and minimum are used to detect the potential effect of a single tweet: as there can be 20 tweets per day, the most negative, positive or subjective tweet of the day might be diluted by the other 19 in the average. Five word lists have then been created. The first one, the “Trade” list, groups all the words and subjects that are presumed to have an impact on exchange rates, such as “tariffs”, “rate” or “currency”. If the word count of this list is 0 for a given day, then the polarity and subjectivity indicators are all set to 0. This method should remove a large part of the noise coming from off-topic tweets. The other four word lists try to capture the tweets related to specific countries and economic zones: the US, the EU, China and Turkey.
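The daily aggregation described above can be sketched as follows. The sketch assumes that each tweet already carries a polarity and a subjectivity score (for instance from TextBlob), and the short “Trade” list is a hypothetical stand-in, since the study’s actual list is not reproduced here.

```python
import re

# Hypothetical "Trade" word list; the study's actual list is not shown.
TRADE_WORDS = {"tariff", "tariffs", "rate", "rates", "currency", "trade"}

INDICATORS = ("max_polarity", "min_polarity", "avg_polarity",
              "max_subjectivity", "avg_subjectivity")


def daily_indicators(tweets):
    """Summarise one day of tweets.

    tweets: list of (text, polarity, subjectivity) tuples, where the two
    scores are assumed to be precomputed, e.g. with TextBlob(text).sentiment.
    """
    words = [word for text, _, _ in tweets
             for word in re.findall(r"[a-z]+", text.lower())]
    trade_count = sum(word in TRADE_WORDS for word in words)
    if trade_count == 0:
        # No trade vocabulary that day: treat the whole day as noise.
        summary = dict.fromkeys(INDICATORS, 0.0)
    else:
        polarities = [p for _, p, _ in tweets]
        subjectivities = [s for _, _, s in tweets]
        summary = {
            "max_polarity": max(polarities),
            "min_polarity": min(polarities),
            "avg_polarity": sum(polarities) / len(polarities),
            "max_subjectivity": max(subjectivities),
            "avg_subjectivity": sum(subjectivities) / len(subjectivities),
        }
    summary["trade_count"] = trade_count
    return summary


# Invented tweets and scores, for illustration only.
on_topic_day = [
    ("China tariffs are working", 0.5, 0.6),
    ("Great rally tonight", 0.8, 0.75),
    ("Unfair trade deal", -0.5, 1.0),
]
off_topic_day = [("Great rally tonight", 0.8, 0.75)]
print(daily_indicators(on_topic_day))
print(daily_indicators(off_topic_day))
```

In this sketch, one trade-related word is enough to keep all of that day’s tweets in the averages; filtering tweet by tweet would be a stricter alternative.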

The resulting parameters are compared to the returns and volatilities of the different currency pairs using econometric tools. The aim of this part is to describe the parameters and targets, to study their correlations and to examine the strength and stability of a linear model built to fit and predict the targets. The correlation between each parameter and each target is tested; it measures the possible mutual relationship between two variables. A high (more than 50%) or moderate (between 25% and 50%) correlation is beneficial to the linear model. Eight linear models are then created and tested, one for the return and one for the volatility of each of the four currency pairs at day t. Each is assumed to depend linearly on the five sentiment indicators and the five word list indicators at day t-1.
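The one-day lag between indicators and targets can be made explicit with a short sketch. It assumes that the correlation in question is the Pearson coefficient, which the text does not state, and it uses invented series.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(indicator, target, lag=1):
    """Correlate the indicator at day t - lag with the target at day t."""
    if lag < 1 or lag >= len(indicator) or len(indicator) != len(target):
        raise ValueError("series must have equal length greater than lag")
    return pearson(indicator[:-lag], target[lag:])


# Invented daily series: the target follows the indicator one day later.
indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [0.0, 2.0, 4.0, 6.0, 8.0]
print(lagged_correlation(indicator, target))
```

Dropping the last indicator value and the first target value is what aligns the tweets of day t-1 with the market outcome of day t.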

The second part of the study concerns the implementation of a trading algorithm using the sentiment indicators, the word lists and the daily returns of the currency pairs. Of the data collected from 1st January 2017 to 18th September 2019, corresponding to 710 days, 70% is randomly selected to form the training set and the remaining 30% forms the testing set. The returns first need to be classified into three categories according to their value. For a given currency pair, the highest returns are classified in the BUY category and the lowest in the SELL category. The ones that are close to zero go to the HOLD category. As the currency pairs do not have the same liquidity and volatility, they cannot share the same thresholds when each value is classified: a 0.05% return on USDEUR does not have the same significance as a 0.05% return on USDTRY. This is why each pair should be processed separately. As an example, of all USDCNY returns from the start date to the end date, the lowest 30% are classified in the SELL category, and the machine should sell USDCNY on the corresponding dates. Three different samples, or ways of arranging the data, have been tested (Table 1). Sample 1 puts approximately the same percentage of data in each category. Sample 2 tries to isolate the extreme values by placing only 30% of the data in the BUY and SELL categories combined. Sample 3 does the opposite, leaving only 10% of the data in the HOLD category. Table 1 describes the full classification of the three samples. The model should be tested on each of them; it should also be compared to a model making random choices and to long-only and short-only strategies. A random forest classifier has been tested for the study.
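The labelling and splitting steps can be sketched as below. The 30% SELL share follows the USDCNY example in the text; the 30% BUY share, the rounding rule, the random seed and the function names are assumptions, and the exact shares for each sample are those of Table 1, which is not reproduced here.

```python
import random


def label_returns(returns, sell_share, buy_share):
    """Label the lowest returns SELL, the highest BUY and the rest HOLD.

    Shares are fractions of the sample, so the thresholds adapt to each
    currency pair's own distribution of returns.
    """
    n = len(returns)
    n_sell = round(n * sell_share)
    n_buy = round(n * buy_share)
    if n_sell + n_buy > n:
        raise ValueError("sell_share and buy_share overlap")
    order = sorted(range(n), key=lambda i: returns[i])  # ascending returns
    labels = ["HOLD"] * n
    for i in order[:n_sell]:
        labels[i] = "SELL"
    for i in order[n - n_buy:]:
        labels[i] = "BUY"
    return labels


def train_test_split(n_obs, train_share=0.7, seed=0):
    """Randomly assign day indices to a training and a testing set."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    cut = round(n_obs * train_share)
    return sorted(indices[:cut]), sorted(indices[cut:])


# Invented daily returns for one currency pair.
returns = [0.4, -0.2, 0.0, 0.1, -0.5, 0.3, -0.1, 0.05, -0.3, 0.2]
print(label_returns(returns, sell_share=0.3, buy_share=0.3))

train_days, test_days = train_test_split(710)
print(len(train_days), len(test_days))
```

Because the labels come from ranks rather than fixed return levels, the same call can be reused for each pair without choosing pair-specific thresholds by hand.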

Results Analysis

The econometric analysis shows that there is some correlation between the sentiment parameters and word lists on the one hand and the exchange rate returns and volatilities on the other (Table 2). The highest score goes to the trade-related vocabulary with the USD/CNY return and volatility, reaching 20% correlation. As all the sentiment indicators are combined with the count of this word list to remove the noise, polarity and subjectivity also score well with USD/CNY. This is not the case for the three other currency pairs: they have very low correlation with trade-related vocabulary, dropping to almost zero for USD/GBP. Thus, this word list might need to be revised in order to capture more vocabulary linked to the trade relationships the US has with the EU, Great Britain and Turkey. Other word lists also show some correlation with certain currency pair parameters. Turkey-related vocabulary is positively correlated with USD/TRY volatility, at 20%, and China-related vocabulary reaches almost 14% correlation with the USD/CNY return. But other parameters score very poorly and need to be revised. The EU-related list did not exceed 5% correlation with any currency pair. Correlations with the sentiment indicators are also very low for most currency pairs, exceeding 10% only for USD/CNY.

The rest of the econometric study showed that the linear models are not strong or stable enough to be validated. As an example, the linear models of volatilities show that three currency pairs have a breaking point, which invalidates their stability. For USD/EUR and USD/CNY the breaking point is around mid-2019, when the trade war between the US, Europe and China was in full swing. For USD/TRY the breaking point is around mid-2018, when the US imposed new tariffs on Turkish goods that badly hurt the Turkish economy. Thus, the period over which each currency pair is studied needs to be reviewed and adapted to the situation of each country or economic zone.

Nevertheless, the trading algorithm has been implemented and tested with the daily returns of the four currency pairs from 1st January 2017 to 18th September 2019. Table 4 shows the in-sample accuracy of the random forest classification algorithm. The trading algorithm was then run over a two-month period, from 19th September 2019 to 26th November 2019. Starting with USD 1,000, the machine has to make a new decision every day: buy, sell or do nothing. After two months the gains are compared with each other. The scenarios are displayed in Table 3. The three samples differ considerably in terms of gains and accuracy. Sample 2 is ineffective: all its results are close to zero gain. It also has the best in-sample accuracy, reaching 70%, which is 20 points higher than a random classification. However, those results can be explained by the fact that the algorithm always chooses the “Hold” category: it then has a 70% chance of being right and, as it never buys or sells, it makes no gain. Samples 1 and 3 both make gains for three currency pairs: USD/EUR, USD/GBP and USD/CNY. But their accuracy is very close to that of a random classification. Thus, the testing period over which the trading algorithm was investing might not be long enough to reveal that the actual gain should be lower.
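The daily decision loop can be sketched under simple assumptions that the text does not spell out: the whole balance is reinvested each day, a BUY earns that day’s return, a SELL earns its negative, a HOLD earns nothing, and there are no transaction costs. The decisions and returns below are invented.

```python
def simulate(decisions, returns, initial=1000.0):
    """Apply one decision per day to that day's realised return.

    Assumes full reinvestment of the balance and no transaction costs.
    """
    balance = initial
    for decision, daily_return in zip(decisions, returns):
        if decision == "BUY":
            balance *= 1 + daily_return
        elif decision == "SELL":
            balance *= 1 - daily_return
        elif decision != "HOLD":
            raise ValueError(f"unknown decision: {decision}")
    return balance


# Invented three-day example.
daily_returns = [0.01, 0.05, -0.02]
model_decisions = ["BUY", "HOLD", "SELL"]

final_model = simulate(model_decisions, daily_returns)
final_long_only = simulate(["BUY"] * 3, daily_returns)
final_hold_only = simulate(["HOLD"] * 3, daily_returns)
print(round(final_model, 2), round(final_long_only, 2), final_hold_only)
```

The hold-only run ends exactly where it started, which mirrors why a classifier that always answers “Hold” can be accurate and still make no gain.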

Some currency pairs show better results than others. In two months and with an initial investment of USD 1,000, the algorithm trading the USD/GBP pair made a gain of up to USD 59, while the one trading the USD/TRY pair lost almost the same amount. However, the accuracies of the different classifiers are very close to each other, being slightly higher for the classifiers using USD/CNY returns. Moreover, as noted earlier, this currency pair is the one with the strongest correlation with the parameters used by the machine. Although USD/CNY was expected to outperform the other currency pairs, its gains merely remain positive for each sample, as do those of USD/EUR. The poor results of the USD/TRY classification can be explained by its low correlations with our parameters, which do not exceed 10%, and by the geopolitical and economic instability of the area.

Conclusion

Overall, the word lists show better correlations than the sentiment indicators. Moreover, it is hard to see a marked difference between positive and negative sentiment with the chosen sentiment indicators. Other indicators found in the literature could be used in further research to identify genuinely negative or positive sentiment and link it to market fluctuations. The linear models could not be properly validated because of instabilities and weaknesses. Indeed, each currency pair has to be studied over a different period, given that political instability at different times in each country or economic zone created breaking points. The results of the trading algorithm showed that the most accurate models are not the ones that make money. However, as the machine traded for only two months, the gains might be offset over a longer period.

The trading algorithm and the linear models created for the study were very simple, as they took only indicators extracted from Donald Trump’s tweets as inputs. The results show that those indicators alone are not sufficient to predict exchange rate movements, but they do show a trend. The models might be stronger and give better results if other parameters were added, such as the previous days’ performance of the given currency pairs and the fundamental indicators describing the economic zones concerned.

It would also be interesting to run an interdependency study. Here, only the effect of the tweets on exchange rate movements has been considered; can we also measure the impact that exchange rate movements have on the sentiment indicators of Donald Trump’s tweets? Finally, the results of the study are not surprising, as the FX market is extremely efficient. Most studies involving sentiment analysis focus on other markets, especially equities and commodities. The impact of Donald Trump’s decisions is likely to be better measured with those asset classes.