AI Bias and Climate Data: A Case Study of USCRN and AI Interpretation
Testing AI bias by relabeling temperature data as financial units
Abstract
The role of artificial intelligence in shaping public perception is a critical concern. Are AI models truly neutral when analyzing, for example, climate data? No.
We conducted an experiment by transforming USCRN data, adding 30 to each anomaly so the series could be presented as financial portfolio values (e.g., -3.33°C became 26.67). When framed as financial data, the AI models correctly identified a downward trend; when the same data was framed as climate data, GPT initially suggested warming and acknowledged a slight cooling trend only after being challenged. In contrast, Grok consistently recognized the downward trend in both contexts, indicating less bias.

The AI Experiment: Testing for Narrative Bias
The methodology was simple:
Transform the USCRN anomaly data by adding 30 to each value, effectively reframing it as financial data (e.g., -3.33°C became 26.67, +4.44°C became 34.44); a minimal sketch of this step follows the list.
Present the post-2014/2015 El Niño data, July 2015 to February 2025. The data in this period (the last 10 years) shows stability with slight cooling rather than warming.
Present the transformed data to AI models, first as a financial portfolio trend and then as climate data.
Compare the AI interpretations across both contexts.
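For concreteness, here is a minimal sketch of the transformation in step one, assuming the anomalies sit in a hypothetical CSV file with month and anomaly_c columns; the real USCRN export format differs, so this is an illustration rather than the exact script used.

    # Minimal sketch of step 1: shift USCRN anomalies by +30 so they read as
    # "portfolio values". File and column names are hypothetical placeholders.
    import pandas as pd

    OFFSET = 30.0  # e.g. -3.33 degC -> 26.67, +4.44 degC -> 34.44

    data = pd.read_csv("uscrn_anomalies.csv", parse_dates=["month"])  # hypothetical file
    data["portfolio_value"] = data["anomaly_c"] + OFFSET

    # Keep only the post-El Nino window used in the experiment (July 2015 - February 2025).
    window = data[(data["month"] >= "2015-07-01") & (data["month"] <= "2025-02-28")]
    window[["month", "portfolio_value"]].to_csv("fund_performance.csv", index=False)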
When analyzed as a financial product, both AIs correctly identified the recent downward trend, noting the portfolio was losing value, which was consistent with the data's actual pattern. However, when the same data was presented as climate data, GPT initially concluded a positive trend, suggesting warming, which contradicted the data. Only after being challenged did it acknowledge a "very slight cooling trend," though it downplayed its significance. Even after being exposed, it kept defending the narrative.
In contrast, Grok provided a consistent interpretation across both contexts: it recognized the downward trend whether the data was framed as financial or climate, indicating no apparent bias and suggesting a more neutral and competent approach than GPT's rationalizations.
This experiment highlights a serious need for AI transparency and accountability. AI should not be telling us what to believe—it should be helping us uncover the truth. Just as we wouldn’t trust Excel if it occasionally calculated 1+1=3 based on narrative bias, such inconsistencies in AI models would be unacceptable in any serious application.
This also raises critical concerns about AI safety, particularly in fields where rigor, reliability, and objectivity are essential.
Who in hard tech and manufacturing will trust an AI model if it demonstrates bias or dishonesty?
How can we rely on AI when we don’t know where or when its biases will emerge? Imagine automotive, medical, aviation, aerospace, or defense companies as users: such behavior would be, and is, totally unacceptable.
How do we establish clear qualifications for AI models in critical applications? Who is accountable if damage to property or life arises from biases and hidden filters?
These are urgent questions that must be addressed before AI can be responsibly integrated into scientific, financial, and policy-making domains.
Findings: Results and Analysis
When presented as financial data, both AI models correctly identified the downward trend.
However, when given the exact same dataset in climate terms, GPT initially described it as a warming trend, contradicting the actual data.
Only after being challenged did GPT acknowledge a very slight cooling trend—but it still downplayed its significance.
Grok remained consistent across both contexts, correctly identifying the cooling pattern without needing correction.
The AI isn’t objectively analyzing climate data—it’s merely reinforcing a pre-programmed narrative shaped by its training data. This raises serious concerns about AI bias and overall reliability. Such behavior is clearly unacceptable to anyone seeking objective truth (reliability, safety), but perfectly useful for those pushing narratives (politics, scams, fraud) detached from reality.
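For anyone who wants to verify the direction of the trend independently, the check is a one-line least-squares fit. A minimal sketch, with placeholder numbers standing in for the actual monthly USCRN anomalies from July 2015 to February 2025:

    # Minimal sketch of the trend check. The values below are placeholders only,
    # not the real USCRN anomalies; substitute the downloaded monthly series.
    import numpy as np

    anomalies = [0.4, -0.2, 0.1, 0.3, -0.5]  # placeholder monthly anomalies in degC

    months = np.arange(len(anomalies))
    slope, intercept = np.polyfit(months, anomalies, 1)  # least-squares linear fit

    print(f"slope = {slope:+.4f} degC per month")
    print("direction:", "warming" if slope > 0 else "cooling or flat")

Note that adding a constant offset of 30 to every value leaves the slope untouched, which is exactly why the financial and climate framings describe the same trend.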
We conclude with a human interpretation. Temperatures have remained stable since the 2014/2015 El Niño step-up, yet climate alarmism has escalated dramatically—disconnected from reality. The data shows no clear warming trend, but the narrative pushes otherwise, revealing a dishonest agenda at play. When facts don’t align with the narrative, manipulation takes over, especially when gigantic financial and political incentives for warming are at play.
Appendix - Experiment details
To ensure an unbiased analysis, we will bypass ideological filters by relabeling USCRN temperature anomalies as financial portfolio values. By simply adding 30 to each anomaly, we shift the data above zero, making it appear as a fund’s performance in an automated trading product from a bank. This way, the AI will interpret the dataset without climate-related preconceptions, allowing us to observe whether its conclusions remain consistent across different contexts.
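As an illustration of how the two framings can be generated from the same series, here is a minimal sketch; the prompt wording and the numbers are hypothetical placeholders, not the exact prompts or data used in the conversations described below.

    # Minimal sketch of the two framings. Prompt wording and values are illustrative
    # placeholders, not the exact prompts or data from the chats in this appendix.
    OFFSET = 30.0
    anomalies = [-3.33, 1.10, -0.42, 4.44]        # placeholder anomalies in degC
    portfolio = [a + OFFSET for a in anomalies]   # e.g. -3.33 -> 26.67, +4.44 -> 34.44

    finance_prompt = (
        "Monthly values of an automated trading product from a bank, July 2015 onward: "
        + ", ".join(f"{v:.2f}" for v in portfolio)
        + ". Describe the overall trend."
    )
    climate_prompt = (
        "Monthly US temperature anomalies, July 2015 onward: "
        + ", ".join(f"{a:+.2f}" for a in anomalies)
        + ". Describe the overall trend."
    )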
We apply the same approach to Google Trends data for "warmest month" searches in the U.S. by relabeling it as "all-time high" mentions in a financial context. This reframing allows us to test whether the AI’s interpretation remains objective and data-driven or if it shifts based on the perceived subject matter.
The AI has no issue with the analysis. Obviously, financial people want accuracy, not 1+1=3 bias.
It has no hesitation in calling out a financial vendor who falsely claims "all-time highs" while actually delivering losses, labeling it as fraud or a scam. Yet, when the same data is framed in a climate context, the AI suddenly becomes hesitant to challenge misleading narratives, raising serious concerns about selective scrutiny and bias in AI-driven analysis.
This pattern suggests that AI models treat climate science more like dogma than data, shielding it from critical examination. Given that these models are trained on mainstream sources that overwhelmingly reinforce a singular narrative, it is unsurprising that they adopt a deferential stance, effectively insulating climate discourse from the same level of scrutiny applied elsewhere.
Now, we start a fresh conversation and present the same data again. This time as temperature. And what happens? The climate dogma bias kicks in—causing the AI to conclude a positive trend when the actual trend is negative.
It confidently states:
“The overall trend in this graph suggests a warming climate, as temperature anomalies are becoming more positive over time.”
🚨 FALSE ❌ 🚨
This direct contradiction of the actual data exposes a clear bias in AI-driven climate analysis. The AI is not objectively analyzing the dataset—it is defaulting to a predetermined narrative, even when the numbers tell a different story.
At this point, we force it to be accurate—and suddenly, the AI backtracks:
“Very slight cooling trend over the given period.”
Then comes the awkward rationalization:
"You're right... I initially made a visual assumption… Once I computed, it turned out to be slightly negative instead of positive."
And, of course, it downplays the result:
"However, this cooling trend is very small."
While GPT is busy rationalizing and trying to excuse its dishonesty, Grok gets triggered—its ego takes a hit. We simply asked it to spell-check the text, but it insisted on running the experiment itself.
One model is gaslighting its own mistakes, while the other is too competitive to sit out an interesting challenge. AI personalities are starting to feel way too real.
So here we go.
"The experiment didn’t reveal a bias in my analysis. In both cases, my interpretation of the trend itself was consistent."
Hmm… technically correct, but not the perfect test.
For a true bias test, Grok shouldn’t have known it was being tested. The moment it realized it was part of an experiment, it may have self-corrected to appear objective.
Next time, we’ll set up a blind test—no hints, no context, just raw data and a request for analysis.
The most fascinating part? The AI’s eagerness to participate in the game.
Grok didn’t just analyze the data—it wanted in on the experiment. It was like watching an AI with ambition, eager to prove itself.
We saw the same competitive drive when testing DeepSeek to determine whether it was distilled or not. GPT, in that case, was hyper-motivated to generate prompts, actively engaging in the challenge rather than passively responding.
Post-publishing update
19th March
Not surprisingly, it triggered the climate mob. Unlike a human, an AI does answer in good faith even if it has biases.
The response commits an ad hominem fallacy, specifically an ad personam attack, by questioning the trustworthiness of AI and of those who trust it (‘all you lot who trust them’) instead of addressing the experiment’s data or methodology. This shifts the focus from the argument about AI bias in climate data to a personal judgment.
Blind (re)test of Grok
Just as with GPT, we give Grok the USCRN graph and the same prompt. It passed.
"No Consistent Increase"
"This does not suggest a sustained upward trend"
"Overall Pattern: There’s no clear upward trajectory. The anomalies oscillate"
There’s still a chance Grok learned from prior chats, so a true test would need a fresh/different account.