Image/Video Segmentation Using BodyPix Model
Department of Computer
St. John College of Engineering and Management
1. INTRODUCTION
For decades, the green screen process has been used throughout the entertainment
industry, in films and music videos alike. Wherever cinematography is involved, the green
screen tends to accompany the filming, and behind-the-scenes (BTS) footage of almost any
entertainment clip shows the process being applied directly or indirectly. The method
requires extensive arrangements in the background: high-specification cameras, a team of
technicians, and a great deal of manpower are needed to set up the stage [1]. If any one
of these elements goes wrong, the result can be a mishap. For example, if a news reporter
wears a green outfit on set while presenting the weather report, the keyed-in graphics
will appear over her clothing. Many such details must therefore be considered. The green
screen method adds a special effect to a clip, but doing so requires a large amount of
supporting equipment and many people working together. The process sounds easy by its
name, but however impressive the final scene may be, setting up the green screen is
complex, and BTS footage of such clips is proof of that. It is also normally carried out
by people with specialist knowledge, whereas our system tries to make it easily available
to a much larger number of people. TensorFlow provides a pre-trained model that can be
used from JavaScript, and this system makes it available in a simpler way through a
website.
2. RELATED WORK
In 2004, G. Mori, Xiao Feng Ren, A.A. Efros, and J. Malik published 'Recovering Human
Body Configurations: Combining Segmentation and Recognition' [4], which helped us
understand how the structure of a body can be recovered by detecting limbs and the torso
at the pixel level. The model can identify a person's body in images from the given or
selected datasets and can also help remove the unwanted background from the image. It
combines both processes, segmentation and recognition, to improve its accuracy. The input
datasets show various players in various positions. The final output is the
configuration of the human body, with or without background removal; background removal
is not the priority here.
In 2016, Yi-Hsuan Tsai, Ming-Hsuan Yang, and Michael J. Black developed 'Video
Segmentation via Object Flow'. Since video segmentation and optical flow estimation are
quite difficult owing to fast-moving objects, deforming shapes, and cluttered backgrounds
and flow is
3. PROPOSED SYSTEM
A block diagram is a diagram of a system in which the principal parts or functions are
represented by blocks connected by lines that show the relationships between the blocks.
Block diagrams are intended to clarify overall concepts without concern for the details
of implementation. Figure 1 shows the working of the system as a block diagram.
Input Realtime Feed/File: Here the data, that is, a video or photo, is fed in for
further processing.
Data Processing and Encoding: The main purpose of the pre-processing step is to
determine the region of interest within the image. Because the input image may contain a
certain amount of noise, the noise must be reduced or removed. Encoding the contents of
a 2-D image in a raw bitmap (raster) format is usually not economical and results in
very large files. Since raw image representations usually require a large amount of
storage space and proportionally long transmission times for file uploads and downloads,
most image file formats employ some kind of compression. Compression methods are lossy
when a tolerable degree of degradation in the visual quality of the resulting image is
acceptable, or lossless when the image is encoded in its full quality.
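The difference between lossless and lossy encoding can be illustrated with a minimal
sketch. It assumes a hypothetical 64x64 grayscale bitmap held as raw bytes, uses DEFLATE
(Python's standard zlib module) as the lossless method, and uses simple 16-level
quantization as a stand-in for a lossy method; none of these choices come from the
system described here.

```python
import zlib

# Hypothetical raw bitmap: 64 identical rows of a horizontal gray gradient.
WIDTH, HEIGHT = 64, 64
row = bytes((x * 4) % 256 for x in range(WIDTH))
raw = row * HEIGHT

# Lossless: DEFLATE shrinks the data, and decompression restores every byte.
compressed = zlib.compress(raw, 9)
restored = zlib.decompress(compressed)

# Lossy (illustrative only): keep the top 4 bits of each pixel.
# The discarded low bits cannot be recovered afterwards.
lossy = bytes((p >> 4) << 4 for p in raw)
max_error = max(abs(a - b) for a, b in zip(raw, lossy))

print("raw bytes:", len(raw))
print("compressed bytes:", len(compressed))
print("lossless round trip exact:", restored == raw)
print("largest per-pixel error after quantization:", max_error)
```

The lossless path trades nothing for its smaller size, while the quantized copy accepts a
bounded error per pixel in exchange for data that is simpler to compress further.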
Feature Extraction: Feature extraction is a part of the dimensionality reduction
process, in which an initial set of data is divided and reduced to more manageable
groups so that it is easier to process. The most important characteristic of these large
data sets is that they have a large number of variables, which require a lot of
computing resources to process. Feature extraction helps to get the best features from
those big data sets by selecting and combining variables into features, which
effectively reduces the amount of data. These features are easy to process but still
describe the actual data set accurately and faithfully.
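As a minimal sketch of combining many variables into fewer features, the example below
averages each tile of a small image into a single value. The function name
block_mean_features and the 4x4 test image are hypothetical, and block averaging is only
an illustration of the idea; it is not claimed to be the feature extractor used by the
model.

```python
def block_mean_features(image, block):
    """Reduce a 2-D grid of pixel values to one mean value per block x block tile."""
    height, width = len(image), len(image[0])
    features = []
    for top in range(0, height, block):
        feature_row = []
        for left in range(0, width, block):
            # Collect the pixels of this tile, clipping at the image border.
            tile = [image[y][x]
                    for y in range(top, min(top + block, height))
                    for x in range(left, min(left + block, width))]
            feature_row.append(sum(tile) / len(tile))
        features.append(feature_row)
    return features


image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
features = block_mean_features(image, 2)
print(features)  # [[0.0, 8.0], [4.0, 2.0]]
```

Here 16 pixel variables are reduced to 4 features that still describe where the bright
and dark regions of the image lie.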
Structural Mesh: Structural mesh generation is used for rendering to a monitor and for
physical simulation such as finite element analysis or computational fluid dynamics.
Meshes are composed of simple cells such as triangles and line segments. They are
created by computer algorithms, often with human guidance through a graphical user
interface (GUI), depending on the complexity of the domain and the type of mesh desired.
The goal is to create a mesh that accurately captures the input domain geometry, with
high-quality (well-shaped) cells, and without so many cells that subsequent calculations
become intractable. The mesh should also be fine (have small elements) in areas that are
important for the subsequent calculations.
Body Segmentation: In computer vision, image segmentation refers to the technique of
grouping the pixels in a picture into semantic areas, typically to locate objects and
boundaries. Person segmentation divides an image into pixels that are part of a person
and pixels that are not. Under the hood, after an image is fed through the model, it is
converted into a two-dimensional map with a float value between 0 and 1 at each pixel,
indicating the probability that a person is present at that pixel. A value called the
“segmentation threshold” represents the minimum score a pixel must reach to be
considered part of a person. Using the segmentation threshold, those 0 to 1 float values
become binary 0s or 1s.
Post-processing: Further processing may be required to enhance the segmented image; it
is performed in this step.
Background/Person Removed: In this step, the required output is displayed.
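The thresholding step described under Body Segmentation can be sketched as follows. The
function names, the threshold of 0.7, and the tiny 2x2 arrays are assumptions made for
illustration; in the actual system the per-pixel scores would come from the pre-trained
model rather than being written by hand.

```python
def to_binary_mask(scores, threshold):
    """Turn per-pixel person probabilities (0 to 1) into a binary mask of 0s and 1s."""
    return [[1 if score >= threshold else 0 for score in row] for row in scores]


def replace_background(frame, mask, background):
    """Keep frame pixels where mask is 1 (person); use background pixels elsewhere."""
    return [
        [frame[y][x] if mask[y][x] == 1 else background[y][x]
         for x in range(len(frame[y]))]
        for y in range(len(frame))
    ]


# Hypothetical model output for a 2x2 image.
scores = [[0.1, 0.9],
          [0.7, 0.4]]
mask = to_binary_mask(scores, threshold=0.7)

frame = [[10, 11],
         [12, 13]]          # original pixels
background = [[90, 91],
              [92, 93]]     # new background pixels
output = replace_background(frame, mask, background)

print(mask)    # [[0, 1], [1, 0]]
print(output)  # [[90, 11], [12, 93]]
```

Raising the threshold makes the mask stricter, so fewer uncertain pixels are kept as part
of the person; lowering it has the opposite effect.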
Figure 1: Block diagram of the proposed system (blocks include Structural Mesh, Body
Segmentation, Post Processing, and Background/Person Removed).
4. EXPERIMENTAL RESULTS
The system works according to the user's choice. The user feeds in a file and selects a
specific task: either changing the background or removing the person from a particular
background. To obtain the result shown in figure 2, the object must be in motion: the
boy in the first canvas must be moving his limbs in order to be identified as a moving
object and to produce the result in the second canvas.
Figure 2: First Canvas and Second Canvas.
The person's movement is shown in figure 3; it is this movement that lets the system
capture the background. These are snapshots from the video whose result is displayed in
figure 2. In this way the system removes the object to reveal the background; the
reverse of this process is also possible. The same approach can also be used to alter
the background of a pre-existing image, as shown in figure 4.
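One way to see why the person must move is the following sketch, which is an assumption
about how such a background could be accumulated and not a description of the system's
actual code. Each frame contributes only the pixels its mask marks as background, so a
pixel is recovered once the person has moved away from it in at least one frame. The
function name recover_background and the 1x3 frames are hypothetical.

```python
def recover_background(frames, masks):
    """Build a person-free image from frames and their binary person masks.

    A pixel stays None if the person covered it in every frame.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = [[None] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for y in range(height):
            for x in range(width):
                if mask[y][x] == 0:      # pixel is not part of the person
                    background[y][x] = frame[y][x]
    return background


# Hypothetical 1x3 scene whose true background is [5, 6, 7]; the person is 99.
frames = [[[99, 6, 7]],    # person covers x = 0
          [[5, 99, 7]]]    # person has moved to x = 1
masks = [[[1, 0, 0]],
         [[0, 1, 0]]]
moving = recover_background(frames, masks)

# If the person never moves, the covered pixel is never observed.
still = recover_background([frames[0], frames[0]], [masks[0], masks[0]])

print(moving)  # [[5, 6, 7]]
print(still)   # [[None, 6, 7]]
```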
REFERENCES:
6.1. Journal Article
[1] Julia Courtenay, ‘Filming with Green Screen: Everything you need to know’,
Infocusfilmschool.com, 2018. [Online]. Available:
https://infocusfilmschool.com/filming-green-screen-guide/
[4] G. Mori, Xiao Feng Ren, A.A. Efros, J. Malik, ‘Recovering Human Body
Configurations: Combining Segmentation and Recognition’, July 19, 2004,
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and
Pattern Recognition, CVPR 2004.
[5] Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixe, Daniel
Cremers, Luc Van Gool, ‘One-Shot Video Object Segmentation’, 2017, Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 221-230.