Machine Learning – Page 2

Unlocking the Power of MapReduce: Using Python and Apache Spark for Enhanced Data Processing

Posted on November 26, 2016 (updated May 7, 2023) by Aakash Sharan

Hey there! So we decided to create a Word Count application – a classic MapReduce example. But what the heck is a Word Count application, you ask? It’s basically a program that reads data and counts how often each word appears, which makes it easy to find the most common words. Easy peasy. For example: dataDF = sqlContext.createDataFrame([('Jax',), ('Rammus',), ('Zac',), ('Xin', ), ('Hecarim', ), ('Zac', ), […]
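To make the idea concrete, here is a minimal sketch of the map and reduce steps in plain Python rather than Spark. It uses only the names visible in the excerpt above (the original list is cut off), and the helper names `map_phase` and `reduce_phase` are made up for illustration; they are not from the post.

```python
from collections import Counter

# Only the names visible in the excerpt; the original list is truncated.
names = ["Jax", "Rammus", "Zac", "Xin", "Hecarim", "Zac"]

def map_phase(words):
    """Emit a (word, 1) pair for every word, as a MapReduce mapper would."""
    return [(word, 1) for word in words]

def reduce_phase(pairs):
    """Sum the counts for each word, as a MapReduce reducer would."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts

word_counts = reduce_phase(map_phase(names))
most_common = word_counts.most_common(1)
print(most_common)  # [('Zac', 2)]
```

On a Spark DataFrame like `dataDF`, the same counts would typically come from grouping on the word column and calling `count()`, with Spark spreading the map and reduce work across the cluster.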

The Art of Election Forecasting: Analyzing the 2012 US Presidential Election with Data Science

Posted on November 8, 2016 (updated May 7, 2023) by Aakash Sharan

Hey there! Let’s talk about this dataset from RealClearPolitics and the US Presidential Election. Before we dive in, let’s get on the same page about a few things: The US Presidential Election happens every four years. There are 50 states in the US and each gets a certain number of electoral votes based on its […]

Making Sense of Big Data: A Beginner’s Guide to Logistic Regression Training in SparkR

Posted on November 6, 2016 (updated May 7, 2023) by Aakash Sharan

Hey there! Let’s do some Machine Learning with SparkR 1.6! The package only gives us the option to do linear or logistic regression, so for this exercise, we’re going to train a […]

Eliminating the Spam Menace: Building an Effective Machine Learning-Based Spam Filter

Posted on July 15, 2016 (updated May 7, 2023) by Aakash Sharan

Hey there! Let’s talk about spam. You know, those annoying emails that keep showing up in your inbox, even though you never signed up for them. Yeah, those. Well, a spam filter is a program that filters out those unwanted emails and messages. Pretty cool, right? So, we’re going to build and evaluate a […]

Transforming Data Analytics: An Honest Review of MITx’s 15.071x Course, The Analytics Edge

Posted on June 28, 2016 (updated May 7, 2023) by Aakash Sharan

Alright, folks! The Analytics Edge course on edX is almost over and boy, have I learned a lot about Machine Learning in the past 2 months! This MOOC is hands down the best one I’ve taken so far, and I hope my other courses can at least live up to its awesomeness. I first […]

MIT’s Kaggle Competition Sees Fierce Competition Among Enrolled Students, with Overfitting a Concern for Some Top Contenders

Posted on June 6, 2016 (updated May 7, 2023) by Aakash Sharan

It’s an absolute thrill to be in the top 1% of the Kaggle competition hosted by MIT! This contest is no joke, with some seriously experienced ML implementers throwing their hats into the ring. And let me tell you, the top 3 are on a whole other level – they’ve achieved over 90% accuracy, which […]

Revolutionizing Baseball Strategy: Validating Moneyball Predictions through Machine Learning Models

Posted on May 11, 2016 (updated May 7, 2023) by Aakash Sharan

After diving into regression analysis, I couldn’t wait to test my newfound skills on some real-world data. Luckily, Kaggle has just the thing – they’re hosting a competition called ‘History of Baseball’ and, even better, they’ve provided a dataset for it! I had a blast analyzing Paul DePodesta’s predictions and statistical findings, using linear regression […]

Mastering Statistics Fundamentals: Key to Success for Machine Learning Engineers and Data Scientists

Posted on April 28, 2016 (updated May 7, 2023) by Aakash Sharan

A year ago, I stumbled upon a link in /r/dataisbeautiful on Reddit. I can’t recall the exact topic of the article, but it was from a site called 538. Being an avid reader, I started exploring other posts on the site and was pretty much blown away by the analyses. The data analysis fascinated me, […]