4marks BA
MEASURES of CENTRAL TENDENCY
A measure of central tendency is a number that represents the typical value in a collection of
numbers. Three familiar measures of central tendency are the mean, the median, and the mode.
Mean = (sum of all data points) / n, where n is the number of data points. (The mean is also
known as the "average" or the "arithmetic average.")
Median = "middle" data point (or average of two middle data points) when the data points are
arranged in numerical order.
Mode = the value that occurs most often (if there is such a value).
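The three definitions above can be checked with a short sketch using Python's standard `statistics` module. The list `data` is made up for illustration and is not from the notes.

```python
import statistics

# Hypothetical sample data, chosen only for illustration
data = [2, 3, 3, 5, 7]

mean = statistics.mean(data)      # sum of all data points divided by n
median = statistics.median(data)  # middle value after sorting
mode = statistics.mode(data)      # value that occurs most often

print("Mean:", mean)      # 20 / 5 = 4
print("Median:", median)  # 3
print("Mode:", mode)      # 3
```

With an even number of data points, `statistics.median` returns the average of the two middle values, which matches the definition given above.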
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.scatter(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
# Show plot
plt.show()
The plot will show points at coordinates (1, 2), (2, 3), (3, 5), (4, 7), and (5, 11), representing the data
from the x and y lists.
3. Steps involved in making plots
Import Libraries:
import matplotlib.pyplot as plt
Prepare Data:
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
Create Plot:
plt.scatter(x, y)
Customize Plot:
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
Show Plot:
plt.show()
Close Plot:
plt.close()
4. Why data visualization is better than traditional text-based data methods
Data visualization means showing information with pictures instead of just words or
numbers. It's like when you look at a graph or chart instead of reading a long list of numbers.
Visuals help us understand things better because our brains like pictures. They make it easier
to see patterns, like trends going up or down, or groups of similar things.
Using pictures to show data makes it easier to talk about it with other people. Everyone can
understand what's going on, even if they're not experts.
It also helps us remember information better. Think about how you remember a picture you
saw compared to something you read.
When we see data in pictures, we can spot mistakes more easily. If something doesn't look
right, it's easier to notice when it's in a graph or chart.
So, data visualization is like telling a story with pictures, making it easier to understand,
remember, and share information.
6. Box plot by calculating the five-number summary. Also, find the Inter Quartile Range.
Let's take a subset of 5 data points from the provided sample data:
{31.5, 36.9, 33.8, 30.1, 33.9}
Sorted in ascending order: 30.1, 31.5, 33.8, 33.9, 36.9
Min = 30.1
First Quartile (Q1): The median of the lower half of the dataset.
Q2 (Median) = 33.8
Third Quartile (Q3): The median of the upper half of the dataset.
Max = 36.9
So, for the subset of 5 data points, the five-number summary is:
Min = 30.1
Q1 = (30.1 + 31.5) / 2 = 30.8
Q2 (Median) = 33.8
Q3 = (33.9 + 36.9) / 2 = 35.4
Max = 36.9
Inter Quartile Range (IQR) = Q3 - Q1 = 35.4 - 30.8 = 4.6
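The calculation above can be sketched in Python as follows. The function name `five_number_summary` is a hypothetical helper, and it assumes the "median of each half" rule used in these notes, where the overall median is left out of both halves when the number of points is odd.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]                                  # values below the median
    upper = s[half + 1:] if len(s) % 2 else s[half:]  # values above the median
    return s[0], median(lower), median(s), median(upper), s[-1]

subset = [31.5, 36.9, 33.8, 30.1, 33.9]
minimum, q1, q2, q3, maximum = five_number_summary(subset)
iqr = q3 - q1

print("Min:", minimum, "Q1:", round(q1, 1), "Median:", q2,
      "Q3:", round(q3, 1), "Max:", maximum)
print("IQR:", round(iqr, 1))
```

Plotting libraries may use a different quartile rule by default (for example, linear interpolation between data points), so the box they draw for such a small sample can differ from the hand calculation.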
○ Researchers purely consider the purpose of the study, along with the
understanding of the target audience.
● Snowball sampling:
○ Researchers apply this method when the subjects are difficult to trace. For
example, surveying shelterless people or illegal immigrants is extremely
challenging. In such cases, using snowball sampling, researchers can
track down a few subjects to interview and ask them to refer others.
Researchers also use this sampling method when the topic is highly
sensitive and not openly discussed, for example, surveys to gather
information about HIV/AIDS. Not many patients will readily respond to
the questions. Still, researchers can contact people they might know or
volunteers associated with the cause to get in touch with the patients
and collect information.
● Quota sampling:
Random Partitioning: Imagine throwing your data into different groups without any
particular order or reason. It's like putting names into hats and picking which hat each name
goes into randomly.
Hash Partitioning: Think of a secret code that each piece of data gets. Data with the same
code goes into the same group. This is useful for spreading data evenly across different
places.
Range Partitioning: Suppose you have a big list of numbers, and you want to split them into
groups based on how big or small they are. You might put numbers between 1 and 10 in one
group, and numbers between 11 and 20 in another group, and so on.
Round-robin Partitioning: This is like taking turns. Each piece of data gets put into a different
group, one after the other, in a circle. It's simple and fair but might not be the best for
organizing the data.
Key-based Partitioning: Imagine sorting your data based on a special key, like the first letter
of a name. All the names starting with "A" go in one group, "B" in another, and so on. It's
helpful for quickly finding specific pieces of data.
List Partitioning: Think of having lists of specific things and putting each piece of data into the
list it belongs to. For example, you might have lists for different types of fruits, and each fruit
goes into its respective list.
Composite Partitioning: This is like using two or more methods together. You might first
organize your data by size and then by color, so you have small red things together, large blue
things together, and so on.
Vertical Partitioning: Imagine splitting your data into columns. Each column goes into its own
group. It's useful when some columns are accessed more often than others, so you can
manage them separately.
Horizontal Partitioning: Picture cutting your data into rows. Each piece of data goes into its
own row, and rows are grouped together. It's like dividing your data into chunks, which can
be spread out across different places for easier handling.
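Three of the methods above (hash, range, and round-robin partitioning) can be sketched in a few lines of Python. The function names and sample values are hypothetical, chosen for illustration, and `zlib.crc32` stands in for the "secret code" only because it gives the same result on every run.

```python
import bisect
import zlib

def hash_partition(keys, n):
    """Put each key in the group chosen by its hash code modulo n."""
    parts = [[] for _ in range(n)]
    for key in keys:
        code = zlib.crc32(str(key).encode())  # same key always gets the same code
        parts[code % n].append(key)
    return parts

def range_partition(values, upper_bounds):
    """Group values by range; upper_bounds are the inclusive upper limits
    of every group except the last, in ascending order."""
    parts = [[] for _ in range(len(upper_bounds) + 1)]
    for value in values:
        parts[bisect.bisect_left(upper_bounds, value)].append(value)
    return parts

def round_robin_partition(items, n):
    """Deal items out to n groups in turn, like dealing cards."""
    return [items[i::n] for i in range(n)]

print(range_partition([3, 15, 10, 11, 25], [10, 20]))  # [[3, 10], [15, 11], [25]]
print(round_robin_partition([1, 2, 3, 4, 5], 2))       # [[1, 3, 5], [2, 4]]
print(hash_partition(["apple", "kiwi", "apple", "plum"], 3))
```

Round-robin keeps the groups the same size but ignores the data itself, while hash and range partitioning let you work out which group holds a given value without searching every group.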
The Balas Additive Algorithm is a method used to solve linear binary optimization problems.
These problems involve maximizing or minimizing a linear objective function subject to
constraints, where the decision variables are binary (taking on values of either 0 or 1).
Preparation: Write the problem as a minimization problem in which every objective coefficient
is non-negative. A variable with a negative coefficient is replaced by its complement
(x = 1 - x').
Implicit Enumeration:
Start with every variable free and set to 0. Because all costs are non-negative, setting the free
variables to 0 is the cheapest way to complete the current partial solution.
If this completion satisfies every constraint, it is the best solution in the current branch. Record
it if it is cheaper than the best solution found so far, then backtrack.
Otherwise, discard the branch if some violated constraint cannot be satisfied by any setting of
the free variables, or if the branch cannot be cheaper than the best solution found so far.
If the branch cannot be discarded, fix one free variable to 1 (usually the one that reduces the
total constraint violation the most) and repeat. On backtracking, fix that variable to 0 instead.
Optimal Solution:
The search ends when every branch has been solved or discarded. The best solution recorded
is optimal for the original binary optimization problem. If no solution was recorded, the
problem is infeasible.
The method uses only additions and comparisons. It never solves a continuous relaxation, so
no fractional values arise and no rounding is needed.
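As an illustration, here is a minimal sketch of implicit enumeration in the style of the Balas additive algorithm, not a full implementation. It assumes the problem is already a minimization with non-negative costs and constraints of the form A x <= b. The function name `balas_min` is hypothetical, and the sketch simply branches on the first free variable instead of using Balas's rule for choosing the most helpful one.

```python
def balas_min(c, A, b):
    """Minimize sum(c[j] * x[j]) subject to A x <= b with x[j] in {0, 1}.

    Assumes every c[j] >= 0 (complement variables beforehand if not).
    Returns (best cost, best x), or (inf, None) if the problem is infeasible.
    """
    n, m = len(c), len(b)
    best_cost, best_x = float("inf"), None

    def search(x, free, cost):
        nonlocal best_cost, best_x
        # Bound test: costs are non-negative, so this branch cannot get cheaper.
        if cost >= best_cost:
            return
        # Slack of each constraint with all free variables left at 0.
        slack = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        if all(s >= 0 for s in slack):
            best_cost, best_x = cost, x[:]  # cheapest completion of this branch
            return
        # Feasibility test: can the free variables repair each violated row?
        for i in range(m):
            if slack[i] < 0:
                best_gain = -sum(min(0, A[i][j]) for j in free)
                if slack[i] + best_gain < 0:
                    return  # no completion can satisfy row i
        j, rest = free[0], free[1:]
        x[j] = 1
        search(x, rest, cost + c[j])  # branch with x[j] = 1
        x[j] = 0
        search(x, rest, cost)         # backtrack: x[j] = 0

    search([0] * n, list(range(n)), 0)
    return best_cost, best_x

# Small example problem with 5 binary variables and 3 constraints
c = [5, 7, 10, 3, 1]
A = [[-1, 3, -5, -1, 4],
     [2, -6, 3, 2, -2],
     [0, 1, -2, 1, 1]]
b = [-2, 0, -1]
print(balas_min(c, A, b))  # (17, [0, 1, 1, 0, 0])
```

Only sums and comparisons appear in the search, and every value of x stays 0 or 1 throughout, so no relaxation or rounding step is involved.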