Read: Mistakes I made when using Twitter data for the first time

Many of these trip-ups are addressed in Twitter’s documentation, so while it’s okay to dip your toes into web scraping and bot-making without reading everything, make sure to flip through the field descriptions and limitations of the API before doing anything important! …



A dataset of Fortune 100 tweets during BLM protests reveals corporate America’s awkward relationship with social justice.

After the death of George Floyd on May 25, 2020, and the protests that peaked two weekends later, any tweet that failed to address what was arguably the largest protest in American history was a faux pas at best and, at worst, implicit support for the status quo. …


The web app uses parsed headlines from the most highly rated Florida Man subreddit posts of all time.

Below is a quick overview of the numbers behind these headlines and here is a link to a web app where you can explore the…


How a simple three-dimensional structure reduces error, outcompetes more complex models, and doubles savings.

THE STRUCTURE


The Weakest Link, a British TV quiz show, ran its last episode in 2017. Did players miss their chance to make the most money possible?

A Monte Carlo simulation allowed me to calculate average earnings when comparing stopping strategy to percent accuracy. Here…
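A minimal sketch of what such a Monte Carlo simulation might look like, not the author's actual code. The money chain, the fixed number of questions per round, and the names `CHAIN`, `simulate_round`, and `average_earnings` are all assumptions made for illustration. Real rounds are timed and end early when the top target is banked, which this sketch ignores.

```python
import random

# Assumed money chain (in pounds); treat these values as illustrative.
CHAIN = [20, 50, 100, 200, 300, 450, 600, 800, 1000]


def simulate_round(p, bank_at, n_questions, chain, rng):
    """Play one simplified round and return the money banked.

    p           -- probability that any single answer is correct
    bank_at     -- stopping strategy: bank after this many correct answers in a row
    n_questions -- assumed fixed number of questions in the round
    """
    banked = 0
    streak = 0
    for _ in range(n_questions):
        if rng.random() < p:
            streak += 1
            if streak == bank_at:
                # Bank the value of the current link and restart the chain.
                banked += chain[streak - 1]
                streak = 0
        else:
            # A wrong answer wipes out the unbanked chain.
            streak = 0
    return banked


def average_earnings(p, bank_at, n_questions=20, trials=10_000, seed=0):
    """Average banked money over many simulated rounds."""
    rng = random.Random(seed)
    total = sum(
        simulate_round(p, bank_at, n_questions, CHAIN, rng) for _ in range(trials)
    )
    return total / trials
```

Sweeping `bank_at` from 1 to `len(CHAIN)` for several values of `p` would give a grid of average earnings by stopping strategy and accuracy.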


In some departments, the number is as high as 85 percent.

  • Cruz ’92 — “Clipping the Wings of Angels: The History and Theory behind the Ninth and Tenth Amendments of the United States Constitution”
  • Kagan ’81 — “To the Final Conflict: Socialism in New York City, 1900–1933”
  • Shields ’87 — “The Initiation: From Innocence to Experience: The Pre-Adolescent/Adolescent Journey in the…


A 99-point error spread

The third-party Twitter apps aren’t built to be used on accounts with millions of followers. Of course, that’s what users did anyway.

One fake-follower calculator, created by SparkToro, a startup claiming approximately $1.8 million in funding, even uses machine learning algorithms to separate real accounts from fake ones. But Twitter limits how much data third-party developers can access in a given time, and that bottleneck has forced these web apps to use a tenuous definition of “random sample” for accounts with millions of followers.

The math is wrong.
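A rough illustration of why a non-random sample can mislead, under stated assumptions. It supposes, hypothetically, that an app can only fetch an account's newest followers and treats them as if they were a random sample. The follower list below is made up, and the function names `fake_share`, `newest_sample`, and `random_sample` are invented for this sketch; none of this describes how any particular app works.

```python
import random


def fake_share(followers):
    """Fraction of followers flagged as fake (True means fake)."""
    return sum(followers) / len(followers)


def newest_sample(followers, k):
    """Take the k most recent followers (list is ordered oldest to newest)."""
    return followers[-k:]


def random_sample(followers, k, rng):
    """Take k followers uniformly at random, without replacement."""
    return rng.sample(followers, k)


# Made-up account: 900 older real followers, then a recent wave of 100 fakes.
followers = [False] * 900 + [True] * 100

true_share = fake_share(followers)
newest_estimate = fake_share(newest_sample(followers, 100))
random_estimate = fake_share(random_sample(followers, 500, random.Random(1)))

print(f"true share:      {true_share:.2f}")
print(f"newest-only:     {newest_estimate:.2f}")
print(f"uniform sample:  {random_estimate:.2f}")
```

In this toy case the newest-only estimate is far from the true share, while the uniform sample lands close to it. The gap depends entirely on how unevenly fake accounts are spread across the follower list.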


Research in coordination with the Open Modeling Framework.

Forecasting technology has given utilities an opportunity to flatten their load curves, raising a whole new family of questions. Below are solutions to important questions that can save utilities a good deal of money by reducing capital and operating expenses from peaking power plants. All testing can be found here.

This research can also be viewed on my website:

Part I: What’s tomorrow’s load?

Main takeaways:

  • To get any kind of useful energy consumption forecast, simple machine learning isn’t appropriate. Deep learning, however, can get us the accuracy we need.
  • Given historical load and temperature data, a straightforward neural…


Peak Shaving with Neural Networks: Part III

How one 19th-century physics equation can increase electric utilities’ savings by more than 60%

This is the third in a three-part series about peak shaving with neural networks. Consider checking out the other two:

Even the best models for predicting energy consumption aren’t good enough to capture a majority of the possible value of peak shaving. When a forecast has just 3% error, it’s not unusual to lose half of possible savings as a consequence. Consider how the smallest inaccuracies dramatically affect these utilities’ expected savings from peak shaving (testing here):


Peak Shaving with Neural Networks: Part II

Electric utilities can detect monthly peaks with only a three-day forecast.

This is the second in a three-part series about peak shaving with neural networks. Consider checking out the other two:

For electric utilities, reducing the monthly demand charge can be hugely profitable. Implementing a peak shaving strategy every day, however, could be costly. If a utility uses direct load control (paying customers to turn off air conditioners, water heaters, etc.), it may frustrate customers by doing so too frequently. If a utility uses storage, overuse can force it to replace expensive batteries more often than necessary. Therefore, it’s not only important…

Kevin McElwee

🏳️‍🌈 Machine learning engineer and data journalist. Learn about me and my projects at www.BrownAnalytics.com
