Satyamev Jayate & Big Data Analytics
I just watched a webinar by Persistent Systems, “Big Data Analytics enhancing customer engagement”, about Satyamev Jayate, the TV show by Aamir Khan that brings social issues in front of the public. It was co-hosted by Jonathan Dotan (President of Media and Digital Strategy at Future Group), Mukund Deshpande (Head, BI & Analytics Competency at Persistent Systems) and Lalit Bhagia (VP and Digital Head, Internet and Mobile, at Star TV).
Here are the notes from that session –
What was Satyamev Jayate about?
- Link - http://www.satyamevjayate.in/
- Covered topics that people in Indian society were not comfortable discussing
- 90-minute show
- Telecast from 11 AM to 12 PM, a morning slot that was prime time 10 years ago but is not these days
- Contained very thorough statistics and research about the topics
- Telecast simultaneously in multiple languages, with unbelievable reach in India
Why was analytics necessary?
- The key was to understand the pulse of the audience
- The goal was not to start a revolution in society; the main focus was to change the perception of the individual
- To measure the impact of the show. Has the audience been touched? How should success be measured?
Some statistics about the show
- Most talked-about show in Indian TV history
- 13 episodes, Television reach – 500 million
- Over a billion impressions (FB, SMS, and other social media)
- 64 million engagements
- 1.45 million Facebook fans for just 13 episodes. (IPL – 1.2 million fans after 5 seasons)
- Topped Twitter trends in India every week the show aired, and twice made the global trends
- 843 cities in India and 5435 cities across the world
- Approx. 1.2 billion connections, 15 million+ responses, 8 million+ community members
- Responses averaged 100 words per message. Rich and quality content!
- Read More - http://www.satyamevjayate.in/impact/impact.php
How did data analytics help the show close the loop with the audience?
- Understand how people's sentiments came across. Example: first episode – 99% of people responded positively. The data was sent to the Chief Minister of Rajasthan, who took action
- Measure the impact of the show on society
- Analyze the response across different demographics
How did they do it? – The Persistent story
- Stakeholder expectations
- Aamir Khan Productions –
- Were the goals met or not?
- Responses, sentiments
- Star India –
- Viewership audience/analysis
- Impact of the show
- Satyamev Jayate Field research team
- 360-degree view of the social issue
- In-depth analysis of the topic
- Accurate and instant statistics
- Weekly cycle
- Sunday – 90-minute show. Collect the information from various social media channels (Twitter, Facebook page, etc.) after the show
- First deliverable of the team – on Tuesday or Wednesday, covering the show aired the previous Sunday
- Publish the story on website
- Featured feedback
- Every Friday –
- Finish all analysis by Friday. This includes:
- Social graph of influencers
- Detailed analysis of the previous week's show
- Show impact analysis by Aamir Khan, in which he would discuss the impact he had seen on society regarding the previous week's topic
Get the content from various mediums
- Social media/websites -> bots crawl and store the data in a database
- What technology was used?
- Separate connectors for every social platform, pulling data through APIs. Twitter – via a data partner. Facebook – directly from the page through APIs
- SMS/IVR – tie-ups with providers. Data pulled directly through APIs
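The "separate connector per platform" idea above can be sketched roughly as follows. This is only an illustration under assumptions: the `Message`, `Connector`, `StaticConnector` and `collect` names are hypothetical, and the connectors return canned data instead of calling any real Twitter, Facebook or SMS provider API (the webinar did not describe those clients in detail).

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass
class Message:
    source: str  # e.g. "twitter", "facebook", "sms"
    author: str
    text: str


class Connector(Protocol):
    """Common interface every platform connector implements (hypothetical)."""

    def fetch(self) -> Iterable[Message]: ...


class StaticConnector:
    """Stand-in for a real API client: yields canned messages for one source."""

    def __init__(self, source: str, rows: list[tuple[str, str]]):
        self.source = source
        self.rows = rows

    def fetch(self) -> Iterable[Message]:
        for author, text in self.rows:
            yield Message(self.source, author, text)


def collect(connectors: list[Connector]) -> list[Message]:
    """Pull from every connector and return one combined list, ready to store."""
    store: list[Message] = []
    for connector in connectors:
        store.extend(connector.fetch())
    return store


# Invented sample data, one connector per channel.
connectors = [
    StaticConnector("twitter", [("user_a", "Great episode")]),
    StaticConnector("facebook", [("user_b", "Very moving"), ("user_c", "Eye opening")]),
    StaticConnector("sms", [("user_d", "Y")]),
]
messages = collect(connectors)
print(len(messages), sorted({m.source for m in messages}))
```

The point of the common interface is that adding a new channel only means writing one more connector; the collection and storage code stays the same.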
- Processing – filtering the content, ranking the content
- While the show was on air, live content from Twitter was used as a baseline and analyzed to create the tag taxonomy
- Almost 80 tags for every show. These can be analyzed across various demographics
- Sentiment and emotional analysis
- Every message was assigned a sentiment analysis score
- Live insights were shared every hour
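To make the per-message sentiment score and the hourly insights concrete, here is a minimal sketch. It assumes a tiny word-list (lexicon) approach, which is a stand-in and not necessarily what Persistent used; the word lists, function names, timestamps and messages are all invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Tiny illustrative lexicon (assumption); a real system would use a far larger one.
POSITIVE = {"great", "good", "inspiring", "support"}
NEGATIVE = {"bad", "shameful", "angry", "sad"}


def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def hourly_average(messages):
    """Group (timestamp, text) pairs by hour and average their scores."""
    buckets = defaultdict(list)
    for ts, text in messages:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(sentiment_score(text))
    return {hour: sum(s) / len(s) for hour, s in sorted(buckets.items())}


# Invented sample messages with arbitrary timestamps.
messages = [
    (datetime(2012, 1, 1, 11, 5), "Great and inspiring show!"),
    (datetime(2012, 1, 1, 11, 40), "This is shameful."),
    (datetime(2012, 1, 1, 12, 10), "Good, I support this"),
]
for hour, avg in hourly_average(messages).items():
    print(hour.strftime("%H:00"), round(avg, 2))
```

Messages with no lexicon hits get a neutral score of 0.0, so they pull the hourly average toward neutral instead of being dropped.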
- Unstructured data – how to tag/rank it?
- Every show covered a different topic, so tagging was extremely difficult
- The topic was out on Sunday morning
- Needed to create a tag taxonomy
- Mix of languages – Hinglish. Adapt the algorithms to mixed-language sentences
- How were regional languages dealt with?
- IVR channel – manual transcribers for Indian languages, as no speech-to-text technology was available for them
- Most of the words were Hinglish -> create a custom dictionary and a custom platform to analyze them
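One simple way to picture the custom Hinglish dictionary is a lookup that maps romanized Hindi words to English before any further analysis. The sketch below is an assumption about how such a dictionary could be applied, not a description of Persistent's platform; the four dictionary entries and the `normalize` function are made up for illustration.

```python
# Hypothetical custom dictionary: romanized Hindi word -> English equivalent.
HINGLISH_TO_ENGLISH = {
    "accha": "good",
    "bahut": "very",
    "bura": "bad",
    "dhanyavaad": "thanks",
}


def normalize(text: str) -> str:
    """Replace known Hinglish tokens with English ones; leave other words as-is."""
    out = []
    for word in text.lower().split():
        core = word.strip(".,!?")  # drop surrounding punctuation before lookup
        out.append(HINGLISH_TO_ENGLISH.get(core, core))
    return " ".join(out)


print(normalize("Show bahut accha tha!"))  # prints "show very good tha"
```

Words missing from the dictionary (like "tha" here) pass through unchanged, which is also a cheap way to spot what the dictionary still needs.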
- Were unique algorithms used?
- Topic evolution – identifying the keywords and where the discussion was evolving
- Taxonomy identification – automated approach
- Ranking and sentiments – Mostly sentiment analysis algorithms used.
- Influence analysis – Mix of visual + computational techniques
- Basic scoring technique
- Visually identify the influencers
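As a rough illustration of topic evolution, the sketch below counts the most frequent keywords per time window and flags the ones that were not among the top keywords of any earlier window. This simple frequency approach is an assumption for illustration only; the stopword list, function names and sample messages are invented, and the actual algorithms were not described in detail.

```python
from collections import Counter

# Small illustrative stopword list (assumption).
STOPWORDS = {"the", "is", "a", "of", "to", "and", "in", "we", "must", "for"}


def top_keywords(texts, n=2):
    """Return the n most frequent non-stopword tokens in a batch of messages."""
    counts = Counter(
        w for t in texts for w in t.lower().split() if w not in STOPWORDS
    )
    return [w for w, _ in counts.most_common(n)]


def topic_evolution(windows, n=2):
    """For each (label, texts) window, list top keywords and newly emerging ones."""
    seen = set()
    result = []
    for label, texts in windows:
        top = top_keywords(texts, n)
        new = [w for w in top if w not in seen]
        seen.update(top)
        result.append((label, top, new))
    return result


# Invented sample messages grouped into two time windows.
windows = [
    ("hour 1", ["save water now", "water crisis", "save water"]),
    ("hour 2", ["water harvesting", "harvesting water", "rain harvesting"]),
]
for label, top, new in topic_evolution(windows):
    print(label, "top:", top, "new:", new)
```

In this toy data the discussion moves from "save water" in the first window to "harvesting" in the second, which is the kind of shift the topic evolution analysis is meant to surface.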
- Data storage?
- Every box in the diagram has its own HDFS storage
- Charts were stored in MySQL databases
- Cloud sourced platforms have their own RDBMS and UI
- Visualization tools – the key was to check whether the goals of the show were met
- Persistent built a Google clustered map maker to show audience response across the world
- How each topic was received in every part of India and by gender (different demographic dimensions)
- Dashboard showing the best content in terms of videos/stories (ensuring that the content is not biased), updated hourly
- Advanced analytics dashboard and visualizations –
- Impact calculation of show
- Detailed dashboard for Aamir Khan Productions for analysis and impact calculations
- Example: How did the young audience react to the show?
- Animated 3D globe (available on website)
- Reporting tools
- Quick view
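The clustered map mentioned above groups nearby responses into a single marker with a count. A minimal sketch of that idea, assuming a simple fixed-size grid (the actual clustering method used was not described), could look like this; `cluster_points`, the cell size and the coordinates are all illustrative.

```python
from collections import defaultdict


def cluster_points(points, cell_deg=1.0):
    """Bucket (lat, lon) points into square grid cells; return count and centroid per cell."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for pts in cells.values():
        n = len(pts)
        clusters.append({
            "count": n,
            "lat": sum(p[0] for p in pts) / n,
            "lon": sum(p[1] for p in pts) / n,
        })
    # Largest clusters first, so the busiest markers are drawn on top.
    return sorted(clusters, key=lambda c: -c["count"])


# Invented approximate coordinates for a handful of responses.
points = [(18.52, 73.86), (18.60, 73.80), (19.08, 72.88), (28.61, 77.21)]
clusters = cluster_points(points)
for c in clusters:
    print(c["count"], round(c["lat"], 2), round(c["lon"], 2))
```

A larger `cell_deg` merges more responses into each marker, which is roughly what happens when a clustered map is zoomed out.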
- Amount of donations, and the mechanism?
- People sent SMS at a nominal charge. The revenue generated from SMS went to that cause
- Website – enabled people to donate money to that NGO
- Reliance Foundation – matched every amount collected and donated the same amount
- Used Axis Bank’s payment gateway
Update on 27th Sep 2012: The recording of the webinar is available here