You need to spend quite a lot of time familiarizing yourself with the data you have been given.
For background information, have you looked up the “Department for Business, Innovation and Skills” on the internet?
The dataset seems to be organized a little like the VRC data we used in class, although it contains far more data.
For the exercise in Week 6, you were given the questionnaire in PDF form. For the assignment, you are not given the questionnaire as such, but the questions are listed in one of the tabs (Variables). You were expected to discover this for yourself!
Have you looked at the Values tab? Can you see how this is linked to the Variables tab? Have you tried to investigate how these tabs are linked to the other tabs? As an example, the first variable in the Variables tab is SIC_CODE. This is also the first thing in the Values tab. Have you realized that the Values tab gives the possible answers for the Standard Industrial Classification? Have you found the possible responses on SIC_CODE in the Data tab? Do these match the possible values listed in the Values tab?
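If you would like to make this check outside Excel, the comparison can be sketched in a few lines of Python. This is only an illustration: the codes below are made up (they are not the real SIC_CODE values), and it assumes you have already copied the listed codes from the Values tab and the responses from the Data tab into the two collections shown.

```python
# Hypothetical codes for illustration only; replace them with the real
# entries from the Values tab and the SIC_CODE column of the Data tab.
values_tab_codes = {1, 2, 3, 4}
data_tab_responses = [1, 3, 3, 2, 9, 1]

# Distinct responses actually present in the data.
observed = set(data_tab_responses)

# Responses that appear in the data but are not listed in the Values tab.
unlisted = sorted(observed - values_tab_codes)

# Codes listed in the Values tab that no respondent actually used.
unused = sorted(values_tab_codes - observed)

print("In Data but not in Values:", unlisted)
print("In Values but never used:", unused)
```

An empty first list would suggest that every response in the data has a matching entry in the Values tab.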
Have you looked for Standard Industrial Classification or SIC_CODE on the internet?
What is Q1_1A? (Look in the Variables tab.) What are the possible responses to this question? (Look in the Values tab.) Have you checked in the Data tab whether these are the responses provided by the respondents? Can you check this more easily? One way is to set a filter so that you can see a list of all values in that column.
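The list that a filter shows you is simply the set of distinct values in the column. As a rough sketch of the same idea in Python, assuming the Q1_1A column has been copied into a list (the responses below are invented for illustration):

```python
from collections import Counter

# Hypothetical responses standing in for the Q1_1A column of the Data tab.
q1_1a_responses = [1, 2, 2, 1, 3, 2, 1, 1]

# Count how many times each response occurs.
counts = Counter(q1_1a_responses)

# The distinct values, in order: this is what the filter drop-down lists.
distinct_values = sorted(counts)

for value in distinct_values:
    print(value, counts[value])
```

Unlike the filter, this also tells you how often each response was given, which can help you spot codes that are very rare or unexpected.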
Have you tried to make sense of how the questions are grouped? As an example, what are questions Q3_2* about? How do they relate to question Q3_1?
What about Q3_3 and Q3_4*?
The first thing to note from the brief is that you need to focus on the South East and North West regions. How can we find out about these two regions? One possibility is that one of the questions is related to region. Therefore, it might be worth looking for a variable/question on Region. Where do you think you can look for this? I would suggest looking in the Variables tab. What is this variable called? What are the possible values of this variable? Can you find this variable in any of the data tabs (Data, Q1-2, Q3, Q4-6)? Have you set a filter to check if the responses are the same as the values listed in the Values tab? What is the code for South East? What is the code for North West?
Have you extracted the data for the South East and placed it in a new tab? Have you done the same for the North West region? To do this, you can filter the data in the tab it is in and then copy the data that meets your requirement and paste it in a new tab.
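The filter-and-copy step amounts to keeping only the rows whose region code matches the one you want. A minimal sketch in Python, where the column name REGION and the codes 7 and 2 are placeholders chosen for illustration (you still need to find the real variable name and codes in the Variables and Values tabs):

```python
# Hypothetical rows; "REGION", "ID" and the codes below are placeholders,
# not the real variable names or codes from the dataset.
rows = [
    {"ID": 101, "REGION": 7},
    {"ID": 102, "REGION": 2},
    {"ID": 103, "REGION": 5},
    {"ID": 104, "REGION": 7},
    {"ID": 105, "REGION": 2},
]

SOUTH_EAST = 7  # assumed code, for illustration only
NORTH_WEST = 2  # assumed code, for illustration only

# Equivalent to filtering on the region column and copying the visible rows.
south_east_rows = [row for row in rows if row["REGION"] == SOUTH_EAST]
north_west_rows = [row for row in rows if row["REGION"] == NORTH_WEST]

print(len(south_east_rows), "South East rows")
print(len(north_west_rows), "North West rows")
```

Whichever tool you use, it is worth checking that the number of rows you end up with matches the count shown when the filter is applied.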
As a starting point, have you tried to look at the breakdown between urban and rural for each of the two regions? You can do this by creating a Pivot Table for each of the regions.
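A Pivot Table of this kind is just a count of respondents for each combination of region and urban/rural category. The sketch below shows the same calculation in Python; the column names REGION and URBAN_RURAL and all of the rows are invented for illustration, so the counts mean nothing in themselves.

```python
from collections import Counter

# Hypothetical rows; the column names and values are placeholders and the
# real dataset may code both variables as numbers.
rows = [
    {"REGION": "South East", "URBAN_RURAL": "Urban"},
    {"REGION": "South East", "URBAN_RURAL": "Rural"},
    {"REGION": "South East", "URBAN_RURAL": "Urban"},
    {"REGION": "North West", "URBAN_RURAL": "Urban"},
    {"REGION": "North West", "URBAN_RURAL": "Rural"},
    {"REGION": "North West", "URBAN_RURAL": "Rural"},
]

# Count respondents for each (region, urban/rural) pair.
breakdown = Counter((row["REGION"], row["URBAN_RURAL"]) for row in rows)

for (region, area), count in sorted(breakdown.items()):
    print(region, area, count)
```

Comparing the counts as percentages of each region's total, rather than as raw numbers, usually makes the two regions easier to compare when they have different numbers of respondents.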