The Regime of Monetisation

We live in a world where everyone is more digitally connected than ever before. Our daily activities are conducted through the devices and platforms we use, leaving a digital footprint for each individual. Not long ago, the idea that a platform could send a message from Scotland to Sri Lanka in a split second whilst holding a library of photographs and connecting you with breaking news would have been branded “alienating” or “apocalyptic”. Since the creation of the first webpage 30 years ago, internet usage has soared, with over 4.5 billion people [1] worldwide now having access.

During this rise, organisations such as Apple, Google, and Facebook have emerged and established themselves as household names. Just as the oil and pharmaceutical giants earned the labels “Big Oil” and “Big Pharma”, the rapid ascent of these digital organisations has earned them the elite title of “Big Tech”. Coined by Jim Cramer, host of Mad Money on CNBC, the acronym FAANG (Facebook, Apple, Amazon, Netflix, and Google) was created for the “Big Tech” organisations due to their dominance within the stock market.

Alongside “big tech”, “big data” has also evolved. In 2020, the global big data market was valued at “USD 138.9 billion” [2] with an estimated growth to “USD 229.4 billion by 2025” [3], making the oft-quoted “data is the new oil” [4] more of a fact than a slogan. Data collection and analysis existed previously, used in domains such as healthcare and the stock markets; however, the rise of “big tech” has dramatically increased the scale, with “2.5 quintillion bytes of data per day” [5] now being produced.

Working in unison, “big tech” has allowed “big data” to increase the potential of every industry, uncovering new insights that can impact performance, consumer experience, product development, marketing growth, and more. It sounds good, right? Well, with rapid growth and dominance on a global scale, governments have become aware of the financial and political power these organisations possess.

With each search we conduct, like we make, or item we purchase, our data is being collected, but who has control over it? It’s easy to believe that we make decisions independently, or that our beliefs are completely true to our morals, but what if someone was tampering with the content you see, or the news you read? The right to voice our own choice hasn’t changed; the darker area lies within the process that leads to our end result. Using our own recorded data against us via “privacy policies”, “higher powers” can predict and direct behavioural change through carefully administered content, generating a mass response in opinions and views.

What is Data Mining?

So, what exactly are these techniques that are implemented to turn our own data against us? Just like a game of chess, there are many ways to reach your final destination. Previous data breaches, such as the hacking of networks and tampering with devices, show the vulnerability of our security systems and the lengths people will go to gain access. In more recent times, more legal methods have emerged, one of the most effective being data mining.

Data Mining Illustration

Data mining is the process of extracting usable data from large sets of raw data to seek connections or predictive trends. Using a combination of statistics, AI (Artificial Intelligence), and Machine Learning within algorithms, organisations can extract, understand, and target specific users’ information and weaknesses. Many different sectors have embraced algorithmic data mining procedures, including banks, education boards, and insurers. This accelerates an organisation’s data processing to an inhuman rate, allowing accessible, quick practices to be automated.

As mentioned before, there are many ways to reach your final destination, and data mining is no exception. Depending on the goal you wish to achieve and the type of information you wish to obtain, a range of methods can be implemented throughout the data mining process, split into seven main categories:

I

Anomaly Detection

Anomaly Detection is the discovery of significant abnormalities within trends, patterns, or classifications in data sets that can raise attention. For example, if a hashtag on Twitter is typically used by white males aged 25-30 and there is suddenly a spike in black males aged 30-35, the data mining algorithm highlights an anomaly within the data set that requires investigation.
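A minimal sketch of this idea, using an illustrative hashtag-usage series and a simple statistical test (values deviating sharply from the mean are flagged; the figures and threshold are assumptions, not real data):

```python
# Flag data points that deviate sharply from the series mean
# (a basic z-score style anomaly test).
from statistics import mean, stdev

def find_anomalies(counts, threshold=2.0):
    """Return indices whose value lies more than `threshold`
    standard deviations from the mean of the series."""
    mu, sigma = mean(counts), stdev(counts)
    return [i for i, c in enumerate(counts)
            if abs(c - mu) > threshold * sigma]

# Illustrative daily hashtag usage by one demographic; day 6 spikes.
daily_usage = [101, 98, 103, 99, 102, 100, 340]
print(find_anomalies(daily_usage))  # → [6]
```

Real systems use far more robust detectors, but the principle is the same: define “normal” from the data itself, then surface whatever falls outside it.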

II

Association

Association is the process that uses correlated variables of data to highlight the probability of one event occurring based on its connection to another. A classic example of this is Netflix’s “because you watched” suggestion algorithm. Because films and series are categorised by genre, sub-genre, tagging, cast, etc., Netflix’s algorithm recommends other titles based on association.
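The “because you watched” idea can be sketched very simply: rank other titles by how many tags they share with the one just watched. The catalogue and tags below are invented for illustration; Netflix’s actual algorithm is far more sophisticated.

```python
# Recommend the title sharing the most tags with what was watched.
catalogue = {
    "Title A": {"crime", "drama", "slow-burn"},
    "Title B": {"crime", "drama", "thriller"},
    "Title C": {"comedy", "romance"},
}

def because_you_watched(watched, catalogue):
    """Rank other titles by tag overlap with `watched`."""
    overlap = {title: len(tags & catalogue[watched])
               for title, tags in catalogue.items() if title != watched}
    return max(overlap, key=overlap.get)

print(because_you_watched("Title A", catalogue))  # → Title B
```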

III

Classification

Classification collects various attributes from data sets, classifying each byte of data into distinct groups. Over time, the data collected from these groups allows algorithms to assign each individual to the most fitting group. For example, insurance quotations can fluctuate based on the criminal activity within your area, which is categorised into low, medium, and high vulnerability classifications.
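The insurance example boils down to mapping an attribute onto a labelled band. The sketch below uses invented incident rates and thresholds, not real actuarial figures:

```python
# Assign each area to a risk band by incidents per 1,000 residents.
# Thresholds are illustrative only.
def classify_risk(incidents_per_1000):
    if incidents_per_1000 < 20:
        return "low"
    elif incidents_per_1000 < 50:
        return "medium"
    return "high"

areas = {"EH1": 12, "G2": 34, "M1": 71}  # hypothetical postcode areas
print({area: classify_risk(rate) for area, rate in areas.items()})
# → {'EH1': 'low', 'G2': 'medium', 'M1': 'high'}
```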

IV

Clustering

Similar to classification, clustering is the grouping of data based on similarities and connections. For example, grouping expenditure by users’ living areas, inferred through social media profiles or the geo-location of photos taken, allows organisations to group demographics and target them with specific advertisements, such as payday loans for lower-income areas.
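Unlike classification, clustering discovers the groups itself. A one-dimensional k-means sketch makes the point; the income figures and starting centres are illustrative assumptions:

```python
# Tiny 1-D k-means: group users by estimated income so that each
# cluster can be targeted differently.
def kmeans_1d(values, centres, rounds=10):
    for _ in range(rounds):
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)),
                          key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Recompute each centre as its cluster's mean.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters

incomes = [14_000, 16_500, 15_200, 52_000, 48_500, 55_000]
low, high = kmeans_1d(incomes, centres=[20_000, 50_000])
print(low)   # → [14000, 16500, 15200]
print(high)  # → [52000, 48500, 55000]
```

Once the lower-income cluster exists, it is trivial to point a payday-loan campaign straight at it, which is exactly the concern raised above.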

V

Pattern Tracking

Pattern Tracking is the concept whereby AI and Machine Learning algorithms discover recognisable patterns within the “big data” set provided. Typically recognising recurring habits or trends, pattern tracking detects when, where, how, and why patterns appear within individuals, groups, minorities, or geo-locations. For example, a delivery platform recognising from previous purchase data that you order a number 5 from a local cuisine on a Friday night.
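That Friday-night habit can be surfaced with nothing more than a weekday histogram over past order timestamps (the dates below are invented for illustration):

```python
# Spot a recurring weekly habit in a user's order history.
from collections import Counter
from datetime import date

orders = [date(2020, 9, 4), date(2020, 9, 11),
          date(2020, 9, 18), date(2020, 9, 25),  # four Fridays
          date(2020, 9, 22)]                     # one Tuesday

# Count orders per weekday (Monday=0 … Sunday=6).
by_weekday = Counter(d.weekday() for d in orders)
habit_day, count = by_weekday.most_common(1)[0]
if count >= 3:
    print(f"Recurring habit detected on weekday {habit_day}")  # weekday 4 = Friday
```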

VI

Prediction

Prediction is one of the most commonly used data mining methods. Using historical data and connections found via association, algorithms calculate the probability of events occurring, ballots won, favourable winners, and more. Betting companies are a prime example: using predictive algorithms, odds are generated based on factors such as form, previous fixtures, the profiles of players, the value of the sporting organisation, and more.
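A stripped-down sketch of form-based prediction: estimate a win probability from past results, weighting recent matches more heavily. The results and decay factor are illustrative assumptions; real bookmakers blend many more signals.

```python
# Estimate a win probability from recent form, with recent matches
# weighted exponentially more than older ones.
def win_probability(results, decay=0.8):
    """`results` lists past outcomes, oldest first (1 = win, 0 = loss)."""
    weights = [decay ** (len(results) - 1 - i) for i in range(len(results))]
    return sum(w * r for w, r in zip(weights, results)) / sum(weights)

# Team lost early in the season but has won its last three matches.
form = [0, 0, 1, 1, 1]
print(round(win_probability(form), 2))  # → 0.73
```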

VII

Regression

Regression is a technique that uses calculated predictions to generate a continuous value based on key influences such as historical data or geographical location. An example of this is the property market, where the range of a property’s value is determined from data such as previous sales figures and the average area income.
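The property example maps onto simple least-squares regression: fit a line through past sales and read off a predicted price. The areas and prices below are invented for illustration:

```python
# Fit a least-squares line predicting sale price from floor area.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum(x * y for x, y in zip(xs, ys)) - n * mx * my) / \
            (sum(x * x for x in xs) - n * mx * mx)
    return slope, my - slope * mx  # (slope, intercept)

areas = [50, 70, 90, 110]                      # floor area, m²
prices = [100_000, 140_000, 180_000, 220_000]  # past sale prices
slope, intercept = fit_line(areas, prices)
print(slope * 80 + intercept)  # predicted price for an 80 m² property → 160000.0
```

Unlike classification, the output here is a continuous value, which is what makes regression suited to valuations rather than labels.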

Auction Illustration

On a second-to-second basis, these specifically designed algorithms are collecting a library of categorised information on each user. A data point is awarded for each categorised byte of data collected. As time passes, these collections can turn into seriously informative and profitable databases that can be used to manipulate or persuade individuals.

Acxiom, an organisation specialising in data mining, once boasted that it had obtained an average of 1,500 data points for each of its 500 million clients. [6] That is one organisation holding 1,500 individual, categorised bytes of data on 500 million users. Suddenly, what seemed like a thoughtless like or an interesting read has been used to establish a database of 500 million people, all sold to the highest bidder.

In the wrong hands, data-mined information can cause problems, especially when there is a conflict of interest or serious intent to harm. Biased constraints and influences can have catastrophic consequences, especially when biased algorithms are used to mine data and act upon users.

Algorithms may be automated, but every single one was once created by humans. Naturally, humans are biased; if their opinions and beliefs are influential during its creation, the algorithm is already corrupt. The use of historical data is also a fundamental flaw, potentially leading to corrupt results. “Big data processes codify the past” [7]; if algorithms are collecting data based on historical and morally wrong factors, then previous generations’ biases are kept alive.

Surveillance & The New Propaganda

So Why Your Data?

Our data is our identity. Every decision we make is influenced by the data we see or produce; it is a never-ending cycle. Think about each platform that you have signed up to. The list is most likely immeasurable; we give up our data for short-term goals, ultimately resulting in hundreds of platforms, each holding thousands of data points on you.

Social Media Illustration

It’s no surprise that social media platforms rank among the highest-scoring platforms for data point collection, as frequent activity sends data point scores soaring. But how does an organisation such as Facebook, a platform that is completely free to sign up for and use, currently have an estimated value of “USD 720 billion” [8] if there is no income? The answer is advertisements: “If you’re not paying for the product, then you are the product.” [9] With 2.6 billion [10] active users, the data shared and collected by Facebook’s data mining algorithms is increasingly concerning. With a reported average of “52,000 data points” [11] per user, it’s no surprise that Facebook’s value keeps surging. That’s 52,000 different traits on each user, which are then converted into demographics used to target specific audiences through advertising mechanisms.

With such a vast amount of collected data, psychographic segmentation is increasingly being implemented to group consumers by traits such as beliefs, ethnicity, occupation, and more. Psychographic groups are determined by connecting traits to isolate demographics; for example, the barber you go to can indicate your ethnicity through the types of service offered, associated pages, or hashtags. Using data points, psychographics can be extremely accurate and influential within specific groups, singling them out for the direct viewing of content. Campaigners and marketers can use these psychographic breakdowns to manipulate specific users based on their agenda and target market.

Of course, “higher powers” are also using these platforms to monitor, censor, and manipulate specific groups of our population. Using our data against us, governments can project content into our everyday routines, changing our viewpoint or behaviour towards important economic and political issues.

2016 US Election

The US election of 2016 was one of the most controversial and dramatic campaigns in history. Republican party candidate Donald Trump defeated Democratic representative Hillary Clinton in what was predicted to be a landslide victory for the former First Lady and US Secretary of State. A sizeable margin of victory was predicted, with organisations such as the New York Times and Huffington Post calculating her chance of winning at 85% and 98% respectively. [12] So how did Donald Trump turn the tide on what seemed to be a certain defeat?

2016 US Predicted State Voting Map

Within US elections, there are three types of state: Democratic, Republican, and competitive. With a good calculation as to which states were almost guaranteed wins, Trump opted for a digital approach to gain the required support from competitive states such as Florida, North Carolina, and Pennsylvania.

Labelled “Project Alamo”, Trump’s team hired Cambridge Analytica, a data analytics organisation specialising in “behavioural microtargeting.” [13] Using social media platforms, particularly Facebook, Cambridge Analytica created algorithms that data mined millions of US voters within the competitive states. Using psychographic segmentation, they were able to pinpoint voters’ pain points, building frameworks to exploit individual needs.

This wasn’t the first time Cambridge Analytica had been influential within a Republican campaign. Ted Cruz, a rival Republican candidate to Donald Trump, had previously hired the services of Cambridge Analytica to harvest detailed US constituency data using psychological profiles. This culminated in Cruz topping Iowa in the GOP polls, despite never having previously led.

Following this minor slip-up, Trump still prevailed as the Republican representative. Cambridge Analytica’s algorithm had collected “5,000 data points on over 220 million Americans.” [14] Using the data points collected, Project Alamo was able to detect the most vulnerable groups, using their own personal data against them. They targeted “on the fence” voters with “weapons-grade communication tactics” [15], publishing a series of individually tailored content via Facebook, through news feeds, recommended videos, and sponsored advertisements. Between June and November 2016, Trump’s campaign published a total of 5.9 million [16] advertisements to Facebook, compared to 66,000 [17] from Clinton: a dramatic difference. Trump may have been communicating through Twitter, but his victory would be won through Facebook.

The idea wasn’t to change voters’ opinions but to demobilise their will to vote. All that was needed was a majority percentage, and getting people to change their vote proved more challenging. Using the segregated data points collected by Cambridge Analytica, Project Alamo focused its digital advertising content on minorities, including 3.5 million African Americans [18], in a bid to deter them from voting.

Donald Trump Glitch

Within this database, many minorities were labelled “Deterrence”. [19] Using media content on social platforms like Facebook, Project Alamo was able to target and manipulate these groups through juggernaut propaganda. Through this method of segregation, Democratic voters effectively didn’t show up in the numbers expected, allowing Trump’s campaign to gain the majority vote in decisive swing states. With states such as Florida, Ohio, and Michigan all voting in favour of Trump, what seemed a near-impossible challenge had been achieved. With 304 electoral votes to Clinton’s 227 [20], Trump had successfully turned the tide on the election, becoming the forty-fifth president of the United States.

China’s Privacy Power

One of the most digitally advanced countries in the world, the People’s Republic of China is rapidly introducing new measures to match its innovative technology. Within the past ten years, there has been a global surge in the use of Chinese technology, with devices crafted and designed in China by organisations such as Huawei and DJI emerging as serious contenders within the global market.

Beyond the private sector, innovative technology is also being deployed across China’s population through government initiatives. The evolution of “Smart Cities” aims to provide a new way to tackle economic and social issues, with the government stating its plan is to create “a multidimensional and systematic project, which requires a high degree of interconnectedness between urban planning and smart city IT infrastructure.” [21] With its technology reputation on the rise, China envisions this being a crucial component in the success of its “smart cities”.

Beijing, China Glitch

It is not only standardised data collection that is being used to connect these “smart city” systems, but every detail of each citizen’s life. One of the leading technologies emerging from China is AI (Artificial Intelligence). Within these smart cities, AI is being used to identify and arrest people using face detection. On the surface, this may seem like a good deed, but ask yourself: when exactly did anyone give permission?

With the rapid development of technology originating from China, the government has been forced to update existing privacy measures and produce new ones. Previously, China did not have generalised data privacy legislation like the EU’s GDPR, leaving citizens’ data completely exposed. Among the legislation now being installed, two specific laws are causing great concern.

Introduced in 2017, the National Intelligence Law caused global concern, putting the protection of governments’, organisations’, and citizens’ data at risk. Under this legislation, organisations that originate from China must provide data to support and co-operate with state intelligence. This applies not only to citizens within China but also to citizens of other countries and global audiences using Chinese technology services.

In 2020, the Data Security Law was preliminarily approved by the government of China. This legislation was introduced to close the gap in data protection by mandating how data is processed. Previously, it was up to data processors and individual users to protect their data, resulting in its exploitation. However, there is ever-growing concern about the details of this new legislation.

All organisations that originate from or conduct “data activities” within China must abide by this new legislation. They must provide “important data”, as determined by the government, including risk assessments, reporting of information, sharing and monitoring of data, as well as early warnings of potential data risks. It also states that the private sector will have to supply any data deemed “conflicting” with the government’s interests or security.

Previously, Hong Kong’s government had significant sovereignty over data and internet usage. In a bid to unite one nation, China’s new legislation now applies to Hong Kong, with Chinese officials controlling monitoring and censorship under what is titled “The Great Firewall of China.” [22] In light of this development, organisations such as Google, Facebook, and Twitter have halted data processing and requests within Hong Kong.

Land of the Free?

With China’s Great Firewall going from strength to strength in government surveillance, other global entities have begun to take action, one being the United States. Given the ongoing political rivalry between the two countries, it’s no surprise that Trump ordered all US organisations to move production out of China, even suggesting “bringing your companies home.” [23] The government also ordered the removal of US software, including Android, from Chinese tech products, in light of intelligence involving Huawei devices and the potential “spying” on US citizens. This was a clear defence tactic to prevent “big tech” organisations’ data from being intercepted by the Chinese government.

The retaliation did not stop there. The US government also banned many Chinese organisations, including the right to sell or use services within the US. As a result, organisations such as WeChat and TikTok are the latest Chinese-based organisations to come under the US banning radar. TikTok, an application created by ByteDance, has recently exploded in popularity throughout the globe. With a monthly average of 800 million [24] users, the US government prohibited transactions on the grounds of safeguarding national security, with President Trump stating, “TikTok needs to free itself from ByteDance's control, or be shut down in the U.S. for good.” [25]

With the prohibition order, ByteDance was placed in a difficult predicament, caught in the crossfire between the US and China in a battle beginning to show similarities to “The Space Race” between the US and USSR during the Cold War. With no other option for survival, ByteDance was forced to sell 20% [26] of its platform. Following initial interest from Microsoft, Trump stated his approval for the American cloud platform Oracle to complete the takeover. This comes as no surprise, given the close relationship between Larry Ellison (CEO of Oracle) and President Trump.

From the outset, this seems like a great deal for all Americans; however, once you uncover the details, the reason for the interest becomes much clearer, leaving us questioning whether the deal really was about national security or about access to US data. With 41% of TikTok users between the ages of 16 and 24, and 15% [27] of active US internet users on the platform, Trump had much to gain from an advantage here. We are seeing a new generation’s political views change, calling for action on climate change, equality of rights, and more. Gaining an advantage on the TikTok platform could drastically increase Trump’s popularity and voting percentage within this demographic.

With trends typically reaching millions of views and being re-created by profile accounts with 800,000+ subscribers, it’s no surprise to see political campaigns growing on this platform. Using this high volume of user activity and networks, political campaigns can data mine users’ behaviours, manipulating suggested content for segregated audiences.

Just like on Twitter, the hashtag is extremely powerful on TikTok. From January to June, the hashtag #Trump2020 attracted over 3.4 billion views; [28] to put that into context, #blacklivesmatter had 2 billion views [29] in June alone. With younger voters being more active, their desire to vote for change is stronger than ever. Trump’s strong relationship with Oracle enables exclusive access to all US data, censorship, and the algorithms used to populate content. With a margin of 107,000 votes [30] across three swing states deciding the 2016 election, pushing propagandised content to younger voters can have a “brain-washing” effect on supporters and could once again deter audiences labelled “deterrence” [31] from voting.

The Future of Our Data

Whether it is for the structuring of advertisements, the manipulation of online content, or high-level smart surveillance, our data is at the fundamental core of it all. Action needs to be taken to ensure we have direct authority over our data. We should not have to worry about the consequences of each action we take and its dramatic impact on future content. After all, your data is your story [32]; it is the digital diegesis of your life, so why should government bodies have complete access to infiltrate and exploit your infirmities?

Our data is extremely easy for “higher powers” to access and control; however, when an individual requests access to their data that has been mined and segregated, their proposal is swiftly denied. David Carroll and Heather Brooke are two individuals who submitted such requests. The result: a lengthy legal battle, and by the time these individuals gain access to their data, it has already been utilised by the “higher power” for its intended purpose.

Within this vicious cycle between individual users, organisations, and government bodies, it is near impossible to generate an evenly balanced solution. Policies are generated by the organisation, but those policies are dictated by the regulations and legislation generated by the national governing body. One thing we can call for is transparency. The dark methods embedded within algorithms have been concealed behind aesthetically delightful user interfaces. “They’re just hiding the flaws in their model, and hoping you won’t ask too many difficult questions.” [33] Calling for a transparent process of where, when, and how your data is being collected, categorised, and implemented allows each user to see exactly where their data is being captured and utilised by both organisations and government bodies.

Former Business Development Director of Cambridge Analytica, Brittany Kaiser has dramatically changed her view on individual data rights following her involvement in Project Alamo’s data mining scandal. Changing her career path to raise awareness of data rights, Kaiser co-founded the #OwnYourData movement. Now a recognised global entity, #OwnYourData’s mission is to educate and create awareness about digital intelligence. Engagement through the hashtag on social media has enabled pro-tech awareness organisations, groups, and individuals to collaborate and share insights about the dark methods used by governments and organisations, as well as developments within global data rights.

Privacy policies can be extremely shady, hiding important details about data deep within their articles. We often accept privacy policies without knowing exactly what we are agreeing to. Ask yourself: have you ever read the full terms and conditions when signing up to a platform? With digital, regional, and international legislation constantly changing, organisations typically update their privacy policies without properly informing us. The term “Zuckering” [34] was coined after Facebook’s privacy policy scandal, in which users were ill-informed of important changes during a mandatory refresh of its privacy policies. If you agreed to the original policy without customising your settings, previously private data such as your gender or living location was made public.

To make privacy policies universally inclusive, users must be given a clear, informative prompt during the initial sign-up process and whenever changes are made. The ability to opt in or out of policy articles or amendments needs to be provided and respected by every platform. We understand that platforms like Facebook, Google, and Twitter need advertisement revenue to survive; however, these organisations need to respect your data rights. You are entrusting each platform with your data, and the safeguarding of personal data needs to be agreed and regulated without reading 100+ pages of privacy policies.

Restructuring the creation and approval process of algorithms is required immediately. Biased algorithms are processing our data through unjust calculations and considerations, producing freshly biased conclusions. Using controlled, standardised measures, we can administer algorithms to support our goals; however, they must be guided by our ethical and moral values. Controlled restraints within algorithms will allow data to be processed fairly, generating balanced conclusions that aid and improve society.

Conclusion

The age of “big data” is truly upon us. Our lives are monitored through every action we make on every platform. Organisations are collecting, manipulating, and selling our own data without consulting us for approval. With the world becoming increasingly connected and confident when using online platforms, these organisations are covering glorified, conniving algorithms with delightful user interfaces.

Data Cycle

Rather than devising and enforcing new legislation upon these organisations, governments are abusing their authority by controlling legislation and gaining leverage from these algorithms, even developing their own data mining contraptions for specific audiences. Action is few and far between, with governments only enforcing punishments when insightful data is caught by sources and released into the public realm.

Within the past four years, awareness of these dark, unethical, and immoral methods has started to surface. However, the retaliation and measures needed to protect our personal data have not materialised. Currently, we have no control over our personal data when using online platforms. Whilst we seek justice for noticeable acts of physical corruption, acts of online inequity progress without facing ramifications.

Throughout the globe, we have praised the emergence of innovative technology whilst never condemning or questioning its actions. Whether it is a corrupt algorithm, platform, or organisation, they all started with, and are influenced by, human behaviour. We need to highlight the importance of our data rights, and that dark methodologies such as data mining and segregation are contrary to our values.

To progress as a society, we need to remove bias from algorithms whilst safeguarding platforms through a user-first initiative. Governing bodies and platforms have become narrow-minded, focusing purely on profit margins and self-gain. A new method of authorisation is required; the days of agreeing to long and confusing privacy policies are over. The ability to approve which organisations and “higher powers” can access our data needs to be introduced. Transparency of our data needs to be imposed, giving us clear, informative insights into what data is being collected and exactly who is accessing it. Only once these measures have been endorsed can we truly move forward as a fair society. We all want to embrace these platforms; it’s time these platforms started to embrace us.

Bibliography