Group 9 - SRS Document
SOFTWARE REQUIREMENT
SPECIFICATIONS FOR AI/ML
BASED CONTENT CREATION
AND DELIVERY
Group 9
Nitya Garg | 209278052
Pulkit Jindal | 209278077
Divya Chintapanti | 209278108
Kartik Ranjan | 209278109
Shreya Chakraborty | 209278111
Kapish Goyal | 209278113
Preetham Upadhya | 209278115
Apoorva Mishra | 209278116
Contents
Vision
Mission
1. Introduction
1.1 Purpose
1.5 Scope
2. Overall Description
Vision
To create a more optimized and efficient system for content creation and broadcast using AI/ML.
Mission
To explore and understand AI/ML functionalities to deploy an AI/ML-based system that enables better
content development and delivery for better content, ad placement, and viewership.
1. Introduction
1.1 Purpose
With customer preferences and demands changing so dynamically in the media industry, it becomes
of the utmost importance for companies to stay one step ahead in predicting what the customer really
wants, how to increase viewership, and how to boost the revenue generated from it via ad sales,
syndication, or linear channel rates.
✓ Project Level:
▪ Departmental Head
▪ Project Manager
▪ Development Team
▪ Content Creation/Handling Teams
▪ Sales and Marketing Teams
✓ External to the Organization:
▪ Customers/End-users
▪ Sales and Marketing Teams of Advertisers (those purchasing ad airtime)
1.5 Scope
The model would bring about operational changes that eliminate most of the daily manual work and
automate several processes, optimizing the value chain of content from creation to delivery to
revenue generation. This would help the business teams identify the KPIs that impact viewership
ratings, and thereby support data-driven content writing, scheduling of content and ads, and improved
customer engagement and viewing experience.
2. Overall Description
The first part of the project consists of data ingestion. Before building the predictive model,
the data needs to be collected and harmonized into a standard format. The data collected would
include the facial expressions of the characters on screen, the airing time of each frame, the
viewership count, etc. The collected data would then need to be analysed.
Next, the data would need to be cleaned and ordered in the most efficient manner. For example, each
audio file will be normalized so that all files have the same length. The data would need to be
modelled into the most usable format, in terms of structure, labels, etc. The idea is to
understand how the various parameters affect the final outcome, and to what extent.
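The audio-length normalization mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code; the target length and sample values are assumptions.

```python
# Minimal sketch of normalizing audio clips to a common length by
# truncating long clips and zero-padding short ones.

def normalize_length(samples, target_len, pad_value=0.0):
    """Truncate or pad a list of audio samples to exactly target_len."""
    if len(samples) >= target_len:
        return samples[:target_len]
    return samples + [pad_value] * (target_len - len(samples))

clip_a = [0.1, 0.5, -0.2, 0.3, 0.7]  # assumed clip, longer than target
clip_b = [0.4, -0.1]                 # assumed clip, shorter than target
print(normalize_length(clip_a, 4))   # truncated to 4 samples
print(normalize_length(clip_b, 4))   # zero-padded to 4 samples
```

In practice the padding value and target length would be chosen per dataset; zero-padding is just one common convention.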
After the datasets are prepared and properly formatted, the AI/ML model will be built and
trained on them. Then, new data will be supplied to the model in order to test the accuracy
of its predictions.
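The hold-out evaluation described above can be sketched as below. The toy labels and the trivial majority-class "model" are assumptions standing in for the real trained model; only the accuracy-measurement idea is the point.

```python
# Sketch of testing prediction accuracy on held-out data.
# A majority-class predictor stands in for the real AI/ML model.

def majority_class(labels):
    """A stand-in 'model' that always predicts the most common label."""
    return max(set(labels), key=labels.count)

def accuracy(predictions, actual):
    """Fraction of predictions that match the actual labels."""
    correct = sum(p == a for p, a in zip(predictions, actual))
    return correct / len(actual)

train_labels = ["drama", "drama", "comedy", "drama"]  # assumed training data
test_labels = ["drama", "comedy", "drama"]            # assumed held-out data

predicted = [majority_class(train_labels)] * len(test_labels)
print(accuracy(predicted, test_labels))  # 2 of 3 predictions correct
```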
Once the predictions are made, they indicate the kind of content to be made, and these models will
be used to help develop the recommended content. The product will gradually incorporate
functionalities that prevent the creation of content that is harmful or offensive to the audiences
most likely to be viewing the program. The product will also include the ability to strategically
place ads that are in tune with what is being broadcast/viewed, so that the brand paying for the
airtime receives maximum retention from the audience. (In such cases, the brand will be charged a
premium for the airtime.)
2. Business Team – Should be capable of classifying emotions in the application and understanding
the classes of content in the various categories.
3. Digital Transformation Team – Should be familiar with machine learning methods and have
the historical data required for them. They should also have ready access to the content
and be capable of understanding it.
4. Marketing and Advertisement – Need to be capable of understanding graphs and relevant
data to identify the ad intervals where ads should be placed.
5. Production Team – Should be able to select and view the best shot, along with its start and end,
to be capable of identifying the most marketable content.
Name FR-2: Emotion Recognition
Summary Classification of the emotions of the characters into pre-defined categories.
Rationale Emotion recognition from visual expression alone cannot be considered accurate
enough; hence, a speech emotion recognition system would enhance the model and
give more accurate results to aid the business teams.
Requirements The aim of the business team is to make content that connects more with
consumers, and one part of that is checking what kinds of emotions users
prefer. To obtain this data, we need to classify emotions into pre-defined
categories. This requires a speech emotion recognition model to help the
business team and the digital analytics team fulfil the following requirements:
a) Have a data repository with pre-defined emotion categories
b) Define multiple classes in the content corresponding to the above categories
c) Provide this data to the digital analytics team for further requirements
References UC-4
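Requirements (a) and (b) above can be sketched as follows. The category names, score values, and labeling rule are illustrative assumptions, not the project's defined taxonomy.

```python
# Sketch of a repository of pre-defined emotion categories (requirement a)
# and labeling content against it (requirement b).

EMOTION_CATEGORIES = {"happy", "sad", "angry", "neutral"}  # assumed set

def label_clip(scores):
    """Pick the highest-scoring category for a clip, rejecting anything
    that is not in the pre-defined repository."""
    best = max(scores, key=scores.get)
    if best not in EMOTION_CATEGORIES:
        raise ValueError(f"unknown emotion category: {best}")
    return best

# assumed per-category scores emitted by a speech-emotion model for one clip
print(label_clip({"happy": 0.7, "sad": 0.2, "neutral": 0.1}))
```

Constraining labels to a fixed repository is what lets the digital analytics team aggregate results consistently across content (requirement c).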
Name FR-5: Performing Quality Checks
Summary Increase customer engagement by performing the visual analysis automatically.
Rationale Performing quality checks and identifying key frames to create trailers and
highlights takes a lot of manual effort and time; automating the process
enhances viewer engagement.
Requirements This is a post-production requirement. The business needs to know the content
quality as well as the regulatory compliance of the content.
The requirements include:
a) Check content quality with respect to multiple parameters such as
filming quality, frame rate, shot quality, etc.
b) Check the regulatory compliance of the content and flag
non-compliant content automatically
c) Identify the best shots in the content so that they can be used for trailers
as well as posters for customer engagement
d) Check the integration of audio and video data and flag any
unpleasant part of the content
e) Maintain a monitoring system for the above requirements
References UC-7
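Automated flagging as in requirements (a) and (b) can be sketched as below. The minimum frame rate, the metadata fields, and the flag names are assumptions for illustration; real checks would run against the actual media files.

```python
# Sketch of automated quality/compliance flagging on clip metadata.

MIN_FRAME_RATE = 24  # assumed minimum acceptable frames per second

def quality_flags(clip):
    """Return a list of quality/compliance flags for a clip's metadata."""
    flags = []
    if clip.get("frame_rate", 0) < MIN_FRAME_RATE:
        flags.append("low frame rate")
    if not clip.get("compliant", True):
        flags.append("regulatory non-compliance")
    return flags

print(quality_flags({"frame_rate": 18, "compliant": False}))  # both flags
print(quality_flags({"frame_rate": 30}))                      # no flags
```

A monitoring system (requirement e) would then periodically run such checks over the whole pipeline and surface the flagged clips.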
Preconditions The content pipeline is stored on native storage devices, making categorization
and sorting quite difficult.
Basic course of events
a) Log in to the on-premises storage with admin accounts.
b) Initiate the transfer of data from native on-premises storage to cloud
storage such as AWS S3.
c) Perform sanity checks on the data stored in the cloud.
d) Create user profiles for accessing and modifying the data.
e) Define rights for each of the user profiles created.
f) Create a log of user activity to track the handling of data.
Alternative paths Provide the data scientists direct access to the on-premises native storage;
this is not advisable for security reasons.
Post-conditions A segregated database of all the content in the pipeline for the
development team to work on.
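Steps (d) and (e) of the basic course of events can be sketched as follows. The profile names and rights are assumptions; a real deployment would express this through the cloud provider's access-control policies rather than application code.

```python
# Sketch of user profiles with per-profile rights on the cloud data.

PROFILES = {
    "data_scientist": {"read"},                    # assumed: read-only
    "data_engineer": {"read", "write"},            # assumed: can modify data
    "admin": {"read", "write", "grant"},           # assumed: can grant rights
}

def can(profile, right):
    """Check whether a profile holds a given right."""
    return right in PROFILES.get(profile, set())

print(can("data_scientist", "write"))  # read-only profile cannot write
print(can("data_engineer", "write"))
```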
c) The model is then run on the frames extracted from the new content in the
pipeline.
d) Segregated frames, grouped by characters and objects, are then created
and stored.
e) The analytics team creates clusters on a per-minute basis using the segregated
frames.
Alternative paths Frames that do not give the desired results (identified using confidence
scores) can be flagged, and manual tagging can be done.
Post-conditions Automatic identification of the characters or objects in each frame of the
relevant content, eliminating long hours of manual work and helping the
teams form clusters for further analysis.
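The alternative path above can be sketched as below: frames whose recognition confidence falls under a threshold are routed to manual tagging. The threshold and the per-frame confidence values are assumptions.

```python
# Sketch of flagging low-confidence frames for manual tagging.

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off for accepting model output

def frames_for_manual_tagging(frame_confidences):
    """Return the ids of frames whose confidence is below the threshold."""
    return [fid for fid, conf in frame_confidences.items()
            if conf < CONFIDENCE_THRESHOLD]

# assumed per-frame confidences from the recognition model
print(frames_for_manual_tagging({"f1": 0.95, "f2": 0.42, "f3": 0.81}))
```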
Basic course of events
a) Break down and analyse weekly TRP ratings on a per-minute basis and
segregate the highly rated and low-rated sections of the content.
b) The extracted frames/sections of the content, grouped according to their
ratings, are then looked up in the model output to obtain parameters such as
the characters, their emotions, place, conversation, etc.
c) Summarize the mapped data to form clusters that define highly rated and
low-rated values of the parameters.
d) Use this summary to gain insight into the type of content that is
preferred and focus further content creation along similar lines.
Alternative paths If the clusters change as per customer preferences, the model automatically
gives out the insights and the content team is made aware of the change.
Post-conditions The redundant weekly job of manually assessing the TRP data and forming
clusters is eliminated, and the process is fully automated.
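Step (a) of the basic course of events can be sketched as follows. The per-minute rating values and the high/low threshold are illustrative assumptions; real TRP data would come from the broadcast measurement feed.

```python
# Sketch of segregating per-minute TRP ratings into highly rated and
# low-rated sections of the content.

def segregate_by_rating(per_minute_ratings, threshold):
    """Split minute indices into high-rated and low-rated groups."""
    high = [m for m, r in enumerate(per_minute_ratings) if r >= threshold]
    low = [m for m, r in enumerate(per_minute_ratings) if r < threshold]
    return high, low

ratings = [2.1, 4.8, 5.2, 1.7, 4.9]  # assumed weekly TRP, one value per minute
high, low = segregate_by_rating(ratings, 4.0)
print(high)  # minutes rated at or above the threshold
print(low)   # minutes rated below the threshold
```

The high-rated minute indices would then be mapped back to the extracted frames to look up characters, emotions, and other parameters, as in step (b).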
Users Production Team
Preconditions Raw formatted content is available, which can then be used to generate
trailers, precaps, or even posters to be made available to the viewers.
Basic course of events
a) The model analyzes the content to select the best possible shots.
b) These shots are identified based on parameters such as color balance,
focus on characters, emotions, best shot of each character, etc.
c) It also gives the start and end of each shot to ensure maximum engagement
and avoid abrupt endings, something that is very difficult for a human
eye to judge.
d) This gives the production team a robust and quick mechanism that
helps market the content well.
Alternative paths Highlights of sports events also require selection of the best shots, which
can also be done using this model.
Post-conditions Better customer engagement, proper content quality, and regulatory
compliance of the content.
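Steps (a) and (b) of the basic course of events can be sketched as a weighted scoring of candidate shots. The parameter names, weights, and per-shot scores are illustrative assumptions; a real system would derive these scores from visual analysis of the frames.

```python
# Sketch of ranking candidate shots by a weighted score over assumed
# parameters (color balance, character focus, emotion strength).

WEIGHTS = {"color_balance": 0.3, "character_focus": 0.4, "emotion": 0.3}

def shot_score(shot):
    """Weighted sum of a shot's per-parameter scores (each in [0, 1])."""
    return sum(WEIGHTS[k] * shot[k] for k in WEIGHTS)

shots = {
    "shot_1": {"color_balance": 0.9, "character_focus": 0.4, "emotion": 0.5},
    "shot_2": {"color_balance": 0.6, "character_focus": 0.9, "emotion": 0.8},
}
best = max(shots, key=lambda name: shot_score(shots[name]))
print(best)  # the shot with the highest weighted score
```

The chosen weights encode an editorial judgment (here, character focus matters most); tuning them against viewer engagement data is one way the production team could refine the ranking.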
3.4.1 Scalability
The NLP and CNN models should be able to handle ever more data over time, as usage data will keep
flowing in at a rapid rate. Also, as the number of shows increases over time, the deployment server
should be easy to scale and should not require reinstallation or backup.
3.4.2 Security
BARC review data is very sensitive; hence, the infrastructure we use for storing it should be
highly secure and free of vulnerabilities.
3.4.3 Capacity
Video and audio generate large volumes of data. The database and cloud storage used must be large
enough to accommodate all the data required for the project.
3.4.4 Maintainability
Once the models are built and deployed, the system needs to be continuously monitored for changes
in accuracy and for new developments. The ML models will have to be retrained periodically to
accommodate newly generated data and falling accuracy.
3.4.5 Reliability
This software will be developed with machine learning, feature engineering, and deep learning
techniques, so no definite reliability percentage can be measured at this stage.
User-provided data will also be compared with the results to measure reliability. The maintenance
period should not be a concern, because the reliable version always runs on the server, which allows
users to access summarization. When admins want to update the system, it takes only as long as
uploading and updating the executable on the server. Users can reach and use the program at any
time, so maintenance should not be a big issue.
3.4.6 Supportability
Maintaining the system requires knowledge of C, Java, Python, and MATLAB. If any problem
occurs on the server side or in the deep learning methods, solving it requires coding knowledge
and a deep learning background. Client-side problems should be fixed with an update, which also
requires coding and networking knowledge.
3.4.7 Usability
The system should be easy to use. The user should get the clustered output and inferences with one
button press if possible, because one of the software's features is saving time. The system should
also be user-friendly for admins, because anyone can be an admin, not only programmers. The
autoencoders and classifiers are trained many times, so it is better to make that easy.
3.4.8 Performance
The capacity of the servers should be as high as possible. Calculation time and response time
should be as low as possible, because one of the software's features is saving time.