Mid term | Computer Science homework help

  

Please answer the following questions:

1. As the Big Data ecosystem takes shape, there are four main groups of players within this interconnected web. List and explain those groups.

2. How does the data science team evaluate whether the model is sufficiently robust to solve the problem? What questions should they ask?

3. Explain the differences between a hexbin plot and a scatterplot, and when to use each of them.

4. Why does k-means not handle categorical data?

5. A local retailer has a database that stores 10,000 transactions from last summer. After analyzing the data, a data science team has identified the following statistics:

● {battery} appears in 6,000 transactions.

● {sunscreen} appears in 5,000 transactions.

● {sandals} appears in 4,000 transactions.

● {bowls} appears in 2,000 transactions.

● {battery, sunscreen} appears in 1,500 transactions.

● {battery, sandals} appears in 1,000 transactions.

● {battery, bowls} appears in 250 transactions.

● {battery, sunscreen, sandals} appears in 600 transactions.

Answer the following questions:

a. What are the support values of the preceding itemsets?

b. Assuming the minimum support is 0.05, which itemsets are considered frequent?
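As a rough aid for checking the arithmetic in question 5, the support of an itemset is its transaction count divided by the total number of transactions, and an itemset counts as frequent when its support is at least the minimum support. The sketch below assumes that definition; the names `counts`, `support`, and `frequent` are illustrative and not part of the assignment.

```python
# Minimal sketch: support = (transactions containing the itemset) / (total transactions).
# All variable and function names here are hypothetical helpers for illustration.
TOTAL_TRANSACTIONS = 10_000
MIN_SUPPORT = 0.05

# Counts taken directly from the statistics listed in the question.
counts = {
    frozenset({"battery"}): 6000,
    frozenset({"sunscreen"}): 5000,
    frozenset({"sandals"}): 4000,
    frozenset({"bowls"}): 2000,
    frozenset({"battery", "sunscreen"}): 1500,
    frozenset({"battery", "sandals"}): 1000,
    frozenset({"battery", "bowls"}): 250,
    frozenset({"battery", "sunscreen", "sandals"}): 600,
}


def support(count, total=TOTAL_TRANSACTIONS):
    """Return the fraction of all transactions that contain the itemset."""
    return count / total


supports = {itemset: support(count) for itemset, count in counts.items()}

# An itemset is treated as frequent when its support meets the threshold.
frequent = {itemset for itemset, value in supports.items() if value >= MIN_SUPPORT}

for itemset, value in supports.items():
    label = "{" + ", ".join(sorted(itemset)) + "}"
    status = "frequent" if itemset in frequent else "not frequent"
    print(f"{label}: support = {value:.3f} ({status})")
```

Under these assumptions, only {battery, bowls} (250 / 10,000 = 0.025) falls below the 0.05 threshold.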

6. Linear regression is an analytical technique used to model the relationship between several input variables and a continuous outcome variable. It can be used in business, government, and medicine. Explain, with an example for each, how it can be used in those domains.

7. Which classifier is considered computationally efficient for high-dimensional problems? Why?

8. Define the following time series components:

● Trend

● Seasonality

● Cyclic

● Random
