Instructions for Group Assignment # 1

The datasets for this group assignment are two CSV files named
selectcollections1.csv and selectcollections2.csv.

This is an "individual" assignment (although your grade will be
determined by the average performance of the members of your
group).

I want you to upload an Excel file here (XLSX format). I will provide
the file to you.

I want you to name this file with your full student ID number; for
example 1234.xlsx (substituting 1234 with your student ID number).

In this assignment you are going to do three things for me:

1) You are going to forecast how much can be collected or
recovered from each account in the new dataset.
2) You are going to select which accounts you want to buy
(compro=1 means you want to buy, compro=0 means you do not
want to buy) from all the accounts contained in that file.
3) You are going to, based on the R-squared and RMSE metrics,
"forecast" how useful you think your best model will be for the
company.

You may purchase as many accounts as you wish for $850 each. This
value is the average of totalpay in the new dataset. You are not
obligated to buy any particular number: you may buy all of them,
some of them, or none at all.
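One natural way to read the pricing above is as a break-even rule: an account that costs $850 is only worth buying if you expect to recover more than $850 from it. The sketch below illustrates that idea. It assumes profit on a purchased account is the amount recovered (totalpay) minus the $850 price, which the assignment does not spell out, and the function names and sample numbers are made up for illustration.

```python
PRICE = 850.0  # purchase price per account, taken from the assignment text


def select_accounts(forecasts, price=PRICE):
    """Return compro flags: 1 (buy) if the forecast exceeds the price, else 0."""
    return [1 if f > price else 0 for f in forecasts]


def realized_profit(actual_totalpay, compro, price=PRICE):
    """Assumed profit: (recovered amount - price), summed over bought accounts."""
    return sum(a - price for a, c in zip(actual_totalpay, compro) if c == 1)


# Made-up illustrative values, not data from the assignment.
forecasts = [1200.0, 300.0, 900.0, 850.0]
actuals = [1000.0, 500.0, 800.0, 2000.0]

compro = select_accounts(forecasts)         # [1, 0, 1, 0]
profit = realized_profit(actuals, compro)   # (1000 - 850) + (800 - 850) = 100.0
```

Note that the fourth account is skipped because its forecast only equals the price, even though its true recovery turned out to be large. The quality of the selection depends entirely on the quality of the forecasts, and a stricter or looser threshold than break-even may be preferable depending on how much you trust your model.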

You are going to be evaluated on three metrics related to the 3 points
above:

1) How close your forecasts are to the true values of totalpay on the
new dataset (I’ll use RMSE).
2) Most importantly: How much profit your selection creates for the
company SelectCollectionsInc.
3) How reasonable your argument for part 3 is.
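For reference, the two model metrics mentioned in this assignment can be computed as in the sketch below. It uses the usual textbook definitions (RMSE as the square root of the mean squared error, R-squared as one minus the ratio of residual to total sum of squares). The assignment does not state the exact formulas the grading program uses, so treat these as an assumption. The sample numbers are invented.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented illustrative values, not data from the assignment.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

model_rmse = rmse(actual, predicted)      # every error is 10, so RMSE is 10
model_r2 = r_squared(actual, predicted)   # 1 - 400 / 50000
```

RMSE is in the same units as totalpay (dollars), so it can be compared directly with the $850 price when judging how useful the model is for purchase decisions.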

In the Excel sheet provided on the course page I want 3 columns:

1) the identifier (id) -> cells a2:a3751.
2) your forecast of how much you can recover from each of the 3750
accounts -> cells b2:b3751
3) a binary variable (1 or 0) indicating which accounts you wish to
buy. The 1 indicates that you want to buy it. 0 indicates that you do
not want to buy it -> cells c2:c3751.

In cells E2 and E3 I want the R-squared and RMSE for your best
model.

Make sure that the cells contain numbers and not formulas. This
work is graded by a program that automates the correction of your
answers.
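If you prepare your answers in Python, a sketch like the following is one way to fill the sheet with plain numeric values rather than formulas. It assumes the third-party openpyxl package is installed, that the provided workbook's first sheet is the one to fill, and that E2 holds the R-squared and E3 the RMSE (the order in which they are listed above). The function names and file names are hypothetical.

```python
def build_rows(ids, forecasts, compro):
    """Build (id, forecast, buy flag) rows with plain numeric values."""
    if not (len(ids) == len(forecasts) == len(compro)):
        raise ValueError("ids, forecasts and compro must have the same length")
    if any(c not in (0, 1) for c in compro):
        raise ValueError("compro must contain only 0 or 1")
    return [(i, float(f), int(c)) for i, f, c in zip(ids, forecasts, compro)]


def write_submission(template_path, out_path, rows, r2, rmse_value):
    """Fill columns A:C from row 2 downward, plus E2 and E3, then save."""
    from openpyxl import load_workbook  # third-party; imported only when used

    wb = load_workbook(template_path)
    ws = wb.active  # assumption: the first sheet is the one being graded
    for offset, (account_id, forecast, buy) in enumerate(rows):
        row = offset + 2  # row 1 is left untouched for the header
        ws.cell(row=row, column=1, value=account_id)
        ws.cell(row=row, column=2, value=forecast)
        ws.cell(row=row, column=3, value=buy)
    ws["E2"] = float(r2)          # assumed: R-squared goes in E2
    ws["E3"] = float(rmse_value)  # assumed: RMSE goes in E3
    wb.save(out_path)


# Invented illustrative values; the real sheet has 3750 accounts.
rows = build_rows([1, 2], [900.5, 100], [1, 0])
# write_submission("provided.xlsx", "1234.xlsx", rows, r2=0.5, rmse_value=400.0)
```

Because the values are written as Python numbers, the saved cells hold numbers and not formulas. With all 3750 accounts the rows end at row 3751, matching the ranges listed above.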

The answer to question 3 will be uploaded as a PDF file named the
same way as the XLSX.