Four Steps to Accelerate Your Machine Learning Journey

This is the golden age of machine learning (ML). Once considered peripheral, ML technology is becoming a core part of businesses around the world, regardless of industry. The International Data Corporation (IDC) estimates that by 2021, spending on artificial intelligence (AI) and other cognitive technologies will exceed $50 billion.

Locally, 25% of organizations say they are setting aside at least 10% of their budget for technology, which includes investments in big data analytics (64%), cloud computing (57%), machine learning and artificial intelligence (33%), and robotic process automation (27%), according to the Malaysian Institute of Accountants’ “MIA-ACCA Business Outlook Report 2020”. [1] As more companies become aware of the importance of ML, they should work towards putting it in motion as quickly and effectively as possible.

At Amazon, we have been on our own ML journey for more than two decades – applying it to areas like personalization, supply chain management, and forecasting systems for our fulfillment process. Today, there is not a single business function at Amazon that is not made better through machine learning.

Whether your company is just getting started or is in the middle of its first implementation, here are four steps to take for a successful machine learning journey.

Get Your Data in Order

When it comes to adopting machine learning, data is often cited as the number one challenge. We have found that data wrangling, data cleanup, and pre-processing can take up more than 50% of the time spent building ML models. Prioritize investing in a strong data strategy early, so that less time and fewer resources go to data cleanup and management later.

When starting out, the three most important questions to ask are:

  • What data is available today?
  • What data can be made available?
  • A year from now, what data will we wish we had started collecting today?

In order to determine what data is available today, you will need to overcome data hugging – the tendency for teams to gatekeep data they work with most closely. Breaking down silos between teams for a more expansive view of the data landscape while still maintaining data governance is crucial for long-term success.

Additionally, identify what data actually matters to your machine learning approach. Think about the best ways to store data, and invest early in data processing tools for de-identification and/or anonymization, if needed.
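As a rough illustration of what a de-identification step might look like, here is a minimal Python sketch. It assumes a record is a plain dictionary, and the field names, the secret key, and the helper functions are all hypothetical, invented for this example. Note that keyed hashing is pseudonymization rather than full anonymization: records stay joinable, so the key must be protected.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical field names chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email"}   # dropped entirely
JOIN_KEYS = {"customer_id"}              # replaced with a keyed hash


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 hash so the same input always maps to the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(record: dict) -> dict:
    """Drop direct identifiers, pseudonymize join keys, keep all other fields."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue
        if field in JOIN_KEYS:
            cleaned[field] = pseudonymize(str(value))
        else:
            cleaned[field] = value
    return cleaned


record = {
    "customer_id": "C-1001",
    "name": "Jane Doe",
    "email": "jane@example.com",
    "basket_value": 42.5,
}
safe = de_identify(record)
print(safe)
```

Because the hash is deterministic for a given key, two datasets processed this way can still be joined on the pseudonymized ID without exposing the original value.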

Identify the Right Business Problems

When evaluating where and how to apply ML, assess each candidate problem across three dimensions: data readiness, business impact, and machine learning applicability.

Balancing speed with business value is key. Instead of trying to embark on a three-year ML project, focus on a handful of critical business use cases that could be solved in the upcoming six to ten months. Start by identifying places where you already have a lot of untapped data and evaluate whether machine learning brings benefits. Avoid picking a problem that is flashy but has unclear business value, as it will end up becoming a one-off experiment.
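One way to make the three-dimension assessment concrete is a simple scorecard. The sketch below is only an illustration: the 1 to 5 scale, the ranking rule (weakest dimension first, total second), and the example use cases and their scores are all assumptions made up for this example, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    data_readiness: int     # 1 (poor) to 5 (strong), assumed scale
    business_impact: int    # 1 (unclear) to 5 (high)
    ml_applicability: int   # 1 (poor fit) to 5 (good fit)

    def dimensions(self):
        return (self.data_readiness, self.business_impact, self.ml_applicability)


def rank(use_cases):
    """Order use cases by their weakest dimension first, then by total score."""
    return sorted(
        use_cases,
        key=lambda u: (min(u.dimensions()), sum(u.dimensions())),
        reverse=True,
    )


# Hypothetical candidates with invented scores.
candidates = [
    UseCase("flashy chatbot demo", data_readiness=2, business_impact=1, ml_applicability=4),
    UseCase("demand forecasting", data_readiness=4, business_impact=5, ml_applicability=4),
    UseCase("invoice matching", data_readiness=5, business_impact=3, ml_applicability=3),
]

ranked = rank(candidates)
for use_case in ranked:
    print(use_case.name, use_case.dimensions())
```

Ranking by the weakest dimension first reflects the advice above: a problem with unclear business value sinks to the bottom even if it is technically appealing.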

Champion a Culture of Machine Learning

In order to scale, you need to champion a culture of machine learning. At its core, ML is experimentation. Therefore, it is imperative that your organization embrace failures and take a long-term view of what is possible.

Businesses also need to bring together a blend of technical and domain experts to work backward from the customer problem. Assembling the right group of people also helps eliminate the cultural barrier to adoption, with quicker buy-in from the business.

Similarly, leaders should constantly find ways to simplify the process of ML adoption for their developers. Since building ML infrastructure at scale is a time- and labor-intensive process, leaders should encourage their teams to use tools that cover the entire ML workflow to build, train, and deploy these models efficiently.

For instance, 123RF, a homegrown stock photography portal, aims to make design smarter, faster, and easier for users. To do so, it relies on Amazon Athena, Amazon Kinesis, and AWS Lambda for data pipeline processing. Its newer products, like Designs.ai Videomaker, use Amazon Polly to create voice-overs in more than 10 different languages. With AWS, 123RF has maintained flexibility in scaling its infrastructure and shortened product development cycles, and it is looking to incorporate other services to support its machine learning and AI research.

Develop Your Team

Developing your team is essential to fostering a successful machine learning culture. Rather than spending resources to recruit new talent in a competitive market, focus on developing your company’s internal talent through robust training programs.

Years ago, Amazon created an in-house Machine Learning University (MLU) to help its own developers sharpen their ML skills or equip neophytes with tools to get started. We made the same machine learning courses available to all developers through AWS’s Training and Certification offering.

DBS Bank, a Singaporean multinational bank, employed a different approach. It is collaborating with AWS to train its employees to program their own ML-powered AWS DeepRacer autonomous 1/18th scale car, and race among themselves at the DBS x AWS DeepRacer League. Through this initiative, it aims to train at least 3,000 employees to be conversant in AI and ML by year end.


[1] Malaysian Institute of Accountants (MIA) and Association of Chartered Certified Accountants (ACCA), “MIA-ACCA Business Outlook Report 2020”, 2020.

Machine Learning in Sports: A Paradigm Shift in Progress

Sports, data analytics and machine learning: three things you would never expect to find in the same sentence, right? Well, what if we told you that sports teams the world over are already putting them together? That’s right, data analytics and machine learning are already part of sports, in some cases since as early as 15 years ago. You’d be surprised how advanced things have gotten; we’re even seeing companies use Amazon Web Services (AWS) to store and process the data.

In sports such as Formula 1 (F1), American football and even rugby, more and more decisions take into account probabilities and numbers generated by machine learning. In fact, one of the sports most adept at using data is Formula 1. Teams generate up to 600GB of data per lap from the 200 to 300 sensors in the cars. In the American National Football League (NFL), each player is analysed based on over 100 data points. These data points drive the plays we, as fans, cheer for and look forward to when we watch the athletes play.

Dilemma: Where to store the data? How to capitalise on it?

When it comes to dealing with the data generated in these sports, the first dilemma is where to store it. Amazon Web Services offers a slew of container and data lake services, such as Amazon S3 storage, that these teams are already using to store their data. However, just keeping the data in the cloud isn’t enough. Teams need to process and analyse the data for it to be truly useful. That’s where machine learning comes in.

While it might seem like a brand-new paradigm, we can assure you that it has been happening behind the scenes for quite a while. Teams in F1, the NFL and even rugby have been collecting data and analysing it to help players perform better, drivers drive better and engineers optimise their technology further. There are companies, such as Pro Football Focus, that process and analyse the data in real time. At AWS re:Invent, Cris Collinsworth, CEO and Co-Founder of Pro Football Focus, said that analysis that used to take coaches around two to three days is now done in less time. He said that this improvement gives coaches more time to strategize and tweak their plays to help their teams win.

The data collected during F1 races doesn’t just go to the cloud for storage. Analysts on the ground are constantly looking at it to help tweak and make critical decisions for that edge. In fact, the data plays a big role in the teams’ pitting and undercutting strategies in a race. The engineers also use this data to help with car design and tweaking between races. F1 has a pretty good head start compared to other sports: it has been using data analytics for over a decade and has been able to use it to help with performance. That isn’t the only use of the data, though. It also informs new regulations that affect the whole sport and the welfare of the drivers.

Machine Learning in Capitalising on Collected Data

“We don’t do magic. We use technology to make decisions.”

Rob Smedley, Expert Technical Consultant, Formula 1

With the advent of machine learning in the past few years, the work of analysing the data has been made even easier. Using services like Amazon SageMaker, companies and teams are able to take advantage of the numerous data points in real time. Machine learning algorithms can churn out predictions and probabilities based on the collected data near instantaneously.
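To give a feel for how an algorithm turns collected data points into a probability, here is a minimal logistic regression sketch in plain Python. It is not how Amazon SageMaker or any team works internally; the two features (tyre wear and gap to the car ahead), the training data, and the "pit now" label are all invented toy values used purely for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability of the positive outcome for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=3000, lr=0.3):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict_proba(x, weights, bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented toy data: [tyre wear from 0 to 1, gap to car ahead in seconds].
samples = [[0.2, 3.0], [0.3, 2.5], [0.8, 1.0], [0.9, 0.5], [0.7, 1.2], [0.1, 3.5]]
# Invented label: 1 if pitting at that moment paid off, 0 otherwise.
labels = [0, 0, 1, 1, 1, 0]

weights, bias = train(samples, labels)
p_pit = predict_proba([0.85, 0.8], weights, bias)
p_stay = predict_proba([0.15, 3.2], weights, bias)
print(f"worn tyres, small gap: {p_pit:.2f}")
print(f"fresh tyres, large gap: {p_stay:.2f}")
```

Once a model like this is trained, scoring a new data point is a handful of multiplications, which is why predictions can arrive almost as fast as the data does.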

That said, the data generated by the machine learning algorithms is only half the picture. It informs the coaches and players of not only the probabilities and possibilities but also what could be done to help give the teams an edge over the competition. The decision-making process on the pitch or track is no longer only a question of gut instinct; it’s about tempering and guiding that instinct with mathematics.

“The teams that are really embracing the new approach are going to win the championships”

Cris Collinsworth, Co-founder and CEO, Pro Football Focus, and Broadcaster for NBC Sports Sunday Night Football

We are at the crossroads of a change in sports paradigms. Coaches are beginning to accept the data processed by machine learning algorithms as a guide for their game-time decisions. The game is changing based on how teams are able to use and optimise machine learning to get the edge they need during game time.

Creating New Fan Experiences

That said, machine learning isn’t just giving the edge during game time. It’s also being used to create new fan experiences. Watching sports can become a pretty mundane experience for some. However, using machine learning and data analytics, broadcasters can create new experiences for fans to keep them more engaged.

In the United States, broadcasters have been experimenting with data lakes and machine learning to enhance the sports viewing experience. This isn’t just restricted to F1, the NFL, the NBA or MLB. It’s across the board. These broadcasters are using machine learning to create overlays and explanations of complexities that help fans better understand the sport. In fact, with the amount of data they have at their fingertips, shoutcasters and commentators are able to see plays before they happen or even suggest some that would have led to a better outcome. These hints of information are also opening up the sports world to new audiences and creating a more engaging experience for long-time sports viewers and fans.

Given the amount of data being collected, it also comes as no surprise that broadcasters and even teams are looking into giving fans a better experience via a second screen. They are looking at what information would make sense and enhance the experience for viewers. Of course, raw data isn’t the answer, but data processed by machine learning algorithms is able to give fans a better understanding and appreciation of the sport. In fact, they expect that it would engage a whole new type of viewer.

“You still need the human element”

Rob Smedley, Expert Technical Consultant, Formula 1

With all the emphasis on machine learning and data analytics, it would seem that sports will be reduced to 1s and 0s. However, as Rob Smedley highlighted, artificial intelligence and machine learning can never replace the driver or player. In fact, the thing that makes sports engaging is the human element in the game. It’s about how athletes are able to push the boundaries of human performance and how we use the data to improve not only the game but also other aspects of human life.