Statistical Machine Learning For Data Science Lab (Integrated) BAD702
Course Code: BAD702
Credits: 04
CIE Marks: 50
SEE Marks: 50
Total Marks: 100
Exam Hours: 03
Teaching Hours/Weeks: [L:T:P:S] 3:0:2:0
Note: It’s recommended to run this program in PyCharm IDE for the best experience and easier execution of Python programs.
A dataset contains the prices of houses in a city. Find the 25th and 75th percentiles and calculate the interquartile range (IQR). How does the IQR help in understanding the price variability?
You are given a dataset with categorical variables about customer satisfaction levels (Low, Medium, High) and whether customers made repeat purchases (Yes/No). Create visualizations such as bar plots or stacked bar charts to explore the relationship between satisfaction level and repeat purchases. What can you infer from the data?
A dataset contains information about car models, including the engine size (in Liters), fuel efficiency (miles per gallon), and car price. Use a pair plot or correlation matrix to explore the relationships between these variables. Which variables seem to have the strongest relationships, and what might be the practical significance of these findings?
You want to estimate the mean salary of software engineers in a country. You take 10 different random samples, each containing 50 engineers, and calculate the sample mean for each. Plot the distribution of these sample means. How does the Central Limit Theorem explain the shape of this sampling distribution, even if the underlying salary distribution is skewed?
A researcher conducts an experiment with a sample of 20 participants to determine if a new drug affects heart rate. The sample has a mean heart rate increase of 8 beats per minute and a standard deviation of 2 beats per minute. Perform a hypothesis test using the t-distribution to determine if the mean heart rate increase is significantly different from zero at the 5% significance level.
A company is testing two versions of a webpage (A and B) to determine which version leads to more sales. Version A was shown to 1,000 users and resulted in 120 sales. Version B was shown to 1,200 users and resulted in 150 sales. Perform an A/B test to determine if there is a statistically significant difference in the conversion rates between the two versions. Use a 5% significance level.
You are comparing the average daily sales between two stores. Store A has a mean daily sales value of $1,000 with a standard deviation of $100 over 30 days, and Store B has a mean daily sales value of $950 with a standard deviation of $120 over 30 days. Conduct a two-sample t-test to determine if there is a significant difference between the average sales of the two stores at the 5% significance level.
A company collects data on employees’ salaries and records their education level as a categorical variable with three levels: “High School”, “Bachelor’s”, and “Master’s”. Fit a multiple linear regression model to predict salary using education level (as a factor variable) and years of experience. Interpret the coefficients for the education levels in the regression model.
You have data on housing prices and square footage and notice that the relationship between square footage and price is nonlinear. Fit a spline regression model to allow the relationship between square footage and price to change at 2,000 square feet. Explain how spline regression can capture different behaviours of the relationship before and after 2,000 square feet.
A hospital is using a Poisson regression model (a type of GLM) to predict the number of emergency room
visits per week based on patient age and medical history. The model is given by:
⦿ Log(λ) =2.5-0.03*Age+0.5*condition
where λ is the expected number of visits per week, Age is the patient’s age, and condition is a binary
variable (1 if the patient has a chronic condition, 0 otherwise).
Interpret the coefficients of Age and condition.
⦿ What is the expected number of visits per week for a 60-year-old patient with a chronic condition?
⦿ How would the expected number of visits change if the patient did not have a chronic condition?
Music Player App: Create a wireframe, Design and prototype the pages with a background and a Rollover button, and Song selection Page with a Home Rollover button. The third page may include animated play and pause button, play music animation, timer animation.