|Question||The aim of this assignment is to investigate and visualise data using various data science tools. It will test your ability to: 1. read data files in Python and extract related data from those files; 2. wrangle and process data into the required formats; 3. use various graphical and non-graphical tools to perform exploratory data analysis and visualisation; 4. use basic tools for managing and processing big data; and 5. communicate your findings in your report. You will need to submit two separate files (Note: submitting a zipped file will attract a penalty of 10%): 1. A report in PDF containing your answers to all the questions. You can use Word or other word processing software to format your submission; just save the final copy as a PDF before submitting. Make sure to include the code, the output and any screenshots/images of the graphs you generate in order to justify your answers to all the questions. (Marks will be assigned to reports based on their correctness and clarity. For example, higher marks will be given to reports containing graphs with appropriately labelled axes.) 2. The Python code, as a Jupyter notebook file (idnumber_FIT5145_A1.ipynb), that you wrote to analyse and plot the data. (Note that the entire assignment should be completed using Python.) Assignment Tasks: The way we supply and use energy in Australia is changing. To understand these changes, to plan for Australia’s energy future, and to make sound policy and investment decisions, we need timely, accurate, comprehensive and readily accessible energy data. The Department of Industry, Science, Energy and Resources is responsible for compiling and publishing Australia’s official energy statistics and balances. These statistics are updated annually and consist of historical energy consumption, production and trade statistics. In this task, you are required to explore the statistics covering all electricity generation in Australia. 
This includes generation by power plants, and by businesses and households for their own use, in all states and territories. It also includes both on-grid and off-grid generation. We have extracted the data from the original files and restricted it to a specific time period. Download the dataset for this assignment from the following link:|
|Question||In this project, you will demonstrate your understanding and mastery of programming in Python using data science tools, in addition to your understanding of the different research methods that use data science. What you have learnt so far should cover almost everything you will need, so if you are stuck, read through the notebooks again. Some of your tutorials and exercises may also be relevant. For Problem 4, you may need to consult the “Bit by Bit” book. If you are still unsure, have a look online. Google and Stack Overflow are your friends. You must use Python only; MATLAB and Excel are not acceptable. For Problem 3, using the command line for preprocessing is acceptable, but you will need to provide your script in a .sh file! The grade of this project will be calculated out of 100, and it will contribute 70% towards your overall grade in the module. The following aspects need to be shown: • Manipulation of different data structures including dictionaries and data frames. • Preparing and preprocessing data. • Doing a basic plot, and changing plot markers, colors, etc. • Improving and extending analysis. • Showing an understanding of different research methods and designs for using data science. • Showing the ability to critique some scientific approaches. Your submission will be a compressed file (.zip) containing: 1. A copy of your Python script named solution_code.ipynb (done in Jupyter Notebook). Your scripts must be sufficient to reproduce your answers to Problems 1-3. 2. A PDF file named P4_answers.pdf that contains your answers to Problem 4. The deadline is Tuesday 28th April at 3:00 PM (UTC).|
|Question||Write three Python functions that accept a list of string items as a parameter. The first function should return the first item in the list whose length is divisible by 3. The second function should return the last item in the list with more than 4 characters. The third function should replicate the list 3 times and output, one entry per line, all the items with an odd number of characters. In the main program, interactively create a list of 5 string items (words, sentences etc.) of varying length. Present the user with a menu of options to select from. If the user enters the character ‘F’ or ‘f’, the program should output the first item whose length is divisible by 3. When the user enters the character ‘L’ or ‘l’, the program should output the last item with more than 4 characters. When the user enters any number, the program should perform the replication and output the items with an odd number of characters. The program should then go back to the menu options for a new selection. When the user enters ‘E’ or ‘e’, the program should terminate with a goodbye message.|
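The three functions and the menu loop described above could be sketched as follows. This is one possible reading of the brief, not a model answer: the function names are made up for illustration, "any number" is assumed to mean a string of digits, and an empty string (length 0) counts as divisible by 3.

```python
def first_divisible_by_3(items):
    """Return the first item whose length is divisible by 3, or None."""
    for item in items:
        if len(item) % 3 == 0:
            return item
    return None


def last_longer_than_4(items):
    """Return the last item with more than 4 characters, or None."""
    for item in reversed(items):
        if len(item) > 4:
            return item
    return None


def replicate_and_print_odd(items):
    """Replicate the list 3 times and print the odd-length items, one per line.

    The printed items are also returned so the function is easy to check.
    """
    odd_items = [item for item in items * 3 if len(item) % 2 == 1]
    for item in odd_items:
        print(item)
    return odd_items


def main():
    """Interactive menu loop (assumed prompts; adjust wording as needed)."""
    items = [input(f"Enter item {i + 1} of 5: ") for i in range(5)]
    while True:
        choice = input("[F]irst, [L]ast, a number to replicate, [E]xit: ").strip()
        if choice.lower() == "f":
            print(first_divisible_by_3(items))
        elif choice.lower() == "l":
            print(last_longer_than_4(items))
        elif choice.isdigit():  # assumption: "any number" means digits only
            replicate_and_print_odd(items)
        elif choice.lower() == "e":
            print("Goodbye!")
            break
        else:
            print("Unrecognised option, please try again.")
```

Calling `main()` starts the interactive session; the three helper functions can also be used on their own with any list of strings.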
|Question||Pairwise relationships are prevalent in real life: for example, friendships between people, communication links between computers and pairwise similarity of images. Networks provide a way to represent a group of relationships. The entities in question are represented as network nodes and the pairwise relations as edges. In real network data, there are often missing edges between nodes. This can be due to a bug or deficiency in the data collection process, a lack of resources to collect all pairwise relations, or simply uncertainty about those relationships. Analysis performed on incomplete networks with missing edges can bias the final output, e.g., if we want to find the shortest path between two cities in a road network, but we are missing information about major highways between these cities, then no algorithm will be able to find the actual shortest path. Furthermore, we might want to predict whether an edge will form between two nodes in the future. For example, in disease transmission networks, if health authorities determine a high likelihood of a transmission edge forming between an infected and an uninfected person, then the authorities might wish to vaccinate the uninfected person. In this way, being able to predict (and correct for) missing edges is an important task. Your task: In this project, you will be learning from a training network and trying to predict whether edges exist among test node pairs. The training network is a fragment of the academic co-authorship graph. The nodes in the network (authors) have been given randomly assigned IDs, and an undirected edge between nodes A and B represents that authors A and B have published a paper together as co-authors. The training network is a subgraph of the entire network, focussing on individuals in a specific academic subcommunity. The test data is a list of 2,000 edges, and your task is to predict whether each of those test edges is really an edge in the authorship network or a fake one. 
1,000 of these test edges are real and withheld from the training network, while the other 1,000 do not actually exist. To make the project fun, we will run it as a Kaggle in-class competition. Your assessment will be partially based on your final ranking in the privately-held competition, partially based on your absolute performance and partially based on your report.|
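As a minimal sketch of how a test pair might be scored from the training network, the example below uses the Jaccard coefficient of the two nodes' neighbour sets. Jaccard similarity is one common link-prediction baseline chosen here for illustration; the brief does not prescribe a method, and the tiny edge list and function names are made up.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def jaccard_score(adjacency, a, b):
    """Shared neighbours divided by all neighbours of either node (0.0 if none)."""
    neighbours_a = adjacency.get(a, set())
    neighbours_b = adjacency.get(b, set())
    union = neighbours_a | neighbours_b
    if not union:
        return 0.0
    return len(neighbours_a & neighbours_b) / len(union)


# Hypothetical toy training network and test pairs (not the assignment data).
train_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adjacency = build_adjacency(train_edges)
test_pairs = [(1, 4), (2, 5)]
scores = {pair: jaccard_score(adjacency, *pair) for pair in test_pairs}
```

A higher score suggests the pair is more likely to be a real withheld edge. In practice such a score would usually be one feature among several fed to a classifier rather than the final prediction.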
|Question||This assignment implements a strategy that makes it impossible for someone to alter the contents of a public platform at a later time. The public platform can be a discussion forum, a bulletin board or a social network such as Facebook or Twitter. The strategy uses a one-way hash such as SHA256 in Python’s “hashlib” and works like this: 1. At the end of Day 1, a snapshot of all contents on the platform is taken. Data in the snapshot is then hashed using SHA256. The outcome of the hash computation, H1, is published to the entire world, such as by printing it in the next day’s New York Times. 2. At the end of Day 2, a snapshot of all contents on the platform is taken. Data in the snapshot, together with the hash of the previous day, H1, is then hashed using SHA256. The outcome of the hash computation, H2, is published to the entire world, such as by printing it in the next day’s New York Times. 3. At the end of Day 3, a snapshot of all contents on the platform is taken. Data in the snapshot, together with the hash of the previous day, H2, is then hashed using SHA256. The outcome of the hash computation, H3, is published to the entire world, such as by printing it in the next day’s New York Times. 4. Likewise, the same operation is applied at the end of Day 4, 5, .... The chain of hash values H1, H2, H3, ..., together with the chain of snapshots, forms an immutable record for the public platform. Your task is to implement the strategy using Python, especially SHA-256 in “hashlib”. To simplify your implementation, you may take the following shortcuts: 1. You may use a flat directory of files to simulate a snapshot of contents in a public platform. Use a Merkle hash tree to build a hash of the snapshot. Note that the number of files and their contents may vary from snapshot to snapshot, and the order in which files are hashed is important. 2. You may use a web site or a simple public file to simulate the New York Times. 3. 
You may use an all-0 H0 as the hash value of the snapshot of a non-existent Day 0. Submit: 1. Your code. 2. A report containing: a. details of your design and implementation of the code; b. screenshots of test runs; c. discussions on i. why the system you build is immutable, ii. the vulnerabilities of your system, iii. the pros and cons of shortened time intervals between snapshots; d. a discussion of how your implementation may be further improved by taking the following aspects into consideration, how you might implement them, and any other improvements you can think of, if you have time later: i. using digital signatures, ii. allowing a snapshot to be composed of a hierarchy of nested directories, each of which may contain both files and sub-directories. 3. Slides for presentation. Presentation: Students will have an opportunity to present their work to the entire class.|
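The daily hash chain could be sketched as below. Several choices here are assumptions rather than requirements of the brief: a snapshot is modelled as an in-memory dict of filename to bytes instead of a directory on disk, files are ordered by filename, an odd node is paired with itself in the Merkle tree, and each day's hash is SHA-256 over the previous hash followed by the Merkle root.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(snapshot: dict) -> bytes:
    """Merkle root of a snapshot given as {filename: file_bytes}.

    Leaves are hashed in filename order so the result is reproducible.
    """
    level = [
        sha256(name.encode("utf-8") + b"\x00" + content)
        for name, content in sorted(snapshot.items())
    ]
    if not level:
        return sha256(b"")  # assumed convention for an empty snapshot
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair the odd node with itself
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def next_day_hash(previous_hash: bytes, snapshot: dict) -> bytes:
    """Chain today's snapshot onto the previous day's published hash."""
    return sha256(previous_hash + merkle_root(snapshot))


H0 = bytes(32)  # all-zero hash for the non-existent Day 0

# Hypothetical snapshots for two days.
day1 = {"post1.txt": b"hello", "post2.txt": b"world"}
day2 = {"post1.txt": b"hello", "post2.txt": b"world", "post3.txt": b"new"}

h1 = next_day_hash(H0, day1)
h2 = next_day_hash(h1, day2)
print(h1.hex())
print(h2.hex())
```

Because H2 depends on H1, changing any file in the Day 1 snapshot changes H1 and therefore every later hash, which is what lets a reader detect tampering against the published values.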
|Question||Problem Description: Write a program to manage a bookstore. The bookstore should have a minimum of 20 different books. All book data (such as title, authors, publisher, publication year, price and maximum discount rate) should be read from a file; other data, such as order details, can be collected interactively (for example, from the keyboard). Take the following inputs from the user (from a file or from the keyboard): 1. Customer information a. Name b. Address c. Telephone 2. The order for the targeted book (e.g. 10 copies of “Harry Potter and the Order of the Phoenix”). 3. For each book you should have at least the following information: a. Name/title of the book b. Author(s) c. Publisher d. Year of publication e. Publication format (i.e. hardcover, paperback etc.) f. Price per unit i. Original price ii. Selling price g. Discount rates h. Book ID (such as the ISBN) i. Postal charge 4. Special offers on some specific books. You have to produce the following outputs: 5. Create a receipt for the whole order (a customer can buy multiple copies of multiple books). 6. Present some statistical reports: a. Total sales (should consider orders from a minimum of 5 different customers) b. Total profit/loss report c. Most popular book of the day d. Highest-selling (in terms of price) book of the day e. Publisher ranking based on sales 7. Show the book ranking based on actual profits.|
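A receipt and a profit ranking along the lines listed above might be sketched as follows. The brief leaves the pricing rules open, so the sketch assumes that the "original price" is the store's cost, that the discount is a fraction applied to the selling price, that the postal charge is added once per order line, and that postage is excluded from profit. The `Book` fields and sample catalogue are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Book:
    book_id: str
    title: str
    publisher: str
    original_price: float  # assumed to be the store's cost per copy
    selling_price: float
    discount_rate: float   # fraction, e.g. 0.25 for 25%
    postal_charge: float   # assumed to apply once per order line


def line_total(book: Book, copies: int) -> float:
    """Amount charged for one order line, including postage."""
    return copies * book.selling_price * (1 - book.discount_rate) + book.postal_charge


def line_profit(book: Book, copies: int) -> float:
    """Profit on one order line (postage treated as pass-through)."""
    return copies * (book.selling_price * (1 - book.discount_rate) - book.original_price)


def make_receipt(order, catalogue):
    """Return (receipt_lines, total_due) for an order [(book_id, copies), ...]."""
    lines, total = [], 0.0
    for book_id, copies in order:
        book = catalogue[book_id]
        amount = line_total(book, copies)
        lines.append(f"{copies} x {book.title}: {amount:.2f}")
        total += amount
    lines.append(f"Total due: {total:.2f}")
    return lines, total


def rank_by_profit(order, catalogue):
    """Book IDs in the order, sorted from most to least profitable."""
    profits = {bid: line_profit(catalogue[bid], n) for bid, n in order}
    return sorted(profits, key=profits.get, reverse=True)


# Hypothetical two-book catalogue and a single order.
catalogue = {
    "B1": Book("B1", "Book One", "Pub A", 6.0, 10.0, 0.25, 5.0),
    "B2": Book("B2", "Book Two", "Pub B", 12.0, 20.0, 0.0, 0.0),
}
order = [("B1", 2), ("B2", 1)]
receipt_lines, total_due = make_receipt(order, catalogue)
print("\n".join(receipt_lines))
```

The same per-line figures can be summed per publisher or per day to produce the other reports; a real solution would load the catalogue of at least 20 books from a file.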
|Question||1. Prompt the user for the country which will be mined. If the user chooses not to provide this information, then assume a default search of the United States. Try to make your communication with the user as friendly as possible, that is, as unrestrictive as possible about how the user should enter countries. E.g. make no distinction between upper and lower case, and accept some common abbreviations, like US or USA for United States, or UK for United Kingdom. If an illegal value is entered (e.g. ‘new transavia’ for country), you can ask again or try to fix it (search online for the Levenshtein distance). Then ask the user to confirm your fix or change it to the right one. If your program fails to fix the illegal value for the country name, then do not include it in the data loading routine. You may wish to use a text list of all countries in the world to define valid countries. Note that the We Feel Fine data set does not necessarily cover all of the countries in this list. Don’t be overwhelmed by the complexity of this part: start with a basic prompt and then gradually increase functionality. Suggested features are desirable but not compulsory. 2. Allow the user a maximum of 5 countries to be successfully mined, although they are also allowed to enter fewer than 5 countries. Load the corresponding data files from the folder countries. Successful mining occurs when the feelings for each country have been recorded and returned to your program. 3. For each feeling in the full list of over 5000 feelings and their frequencies, determine the number of times each feeling appears in the mined text, for each country. For any counts that are larger than 0, you will need to retain the third column of information, which is the hexadecimal equivalent of the colour of the prescribed feeling. 4. For each country, produce a plot of ellipses where each ellipse represents a feeling, has a size proportional to the frequency of its occurrence, and is coloured based on the full list of feelings referenced above. 
Ellipse position can be random. The code for this component is provided and explained below, however you will need to make a number of adjustments to it. 5. Run the base query of data file World.txt to determine the first 1500 feelings mined by We Feel Fine from anywhere in the world. We will compare these mined feelings with the chosen countries. There is|
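For the country prompt in step 1, a minimal sketch of case-insensitive matching, abbreviation handling and a Levenshtein-based suggestion is shown below. The alias table, the short list of valid countries and the distance threshold of 3 are illustrative assumptions; a real program would load the full country list from a text file and ask the user to confirm any suggested fix.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical lookup tables for illustration only.
ALIASES = {"us": "united states", "usa": "united states", "uk": "united kingdom"}
VALID_COUNTRIES = ["united states", "united kingdom", "australia", "france"]
DEFAULT_COUNTRY = "united states"


def resolve_country(raw, valid=VALID_COUNTRIES, max_distance=3):
    """Return a valid country name, the closest suggestion, or None.

    Empty input falls back to the default; input further than max_distance
    edits from every valid name is rejected (None).
    """
    name = raw.strip().lower()
    if not name:
        return DEFAULT_COUNTRY
    name = ALIASES.get(name, name)
    if name in valid:
        return name
    best = min(valid, key=lambda country: levenshtein(name, country))
    return best if levenshtein(name, best) <= max_distance else None


print(resolve_country("USA"))
print(resolve_country("Austrlia"))
print(resolve_country("new transavia"))
```

When `resolve_country` returns a name that differs from what the user typed, the program should show it and ask for confirmation before loading that country's data file.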
|Question||In this project, you will work individually to write programs which demonstrate your understanding of IPO and usage of simple functions in Python programs. Content and Structure: You will have to write simple programs to: 1. Accept inputs from the user 2. Perform mathematical operations to process the data entered by the user 3. Print the output 4. Use simple functions: pass the user inputs to a function which performs the operation and returns the result, which will be displayed on the console. Program expectation: • The student must be able to explain the working of the program and its logic. • The program should be indented, proper comments should be given, a modification history should be present, and variable names and data types should be chosen appropriately. • The program should compile and execute to display the result. • The student must use programming constructs available in Python and follow coding standards.|
Draw a flowchart that presents the steps of the algorithm required to perform the task specified….
|Question||-------------------------------------------------
Reward 4 Shop Points and Discount Calculator
-------------------------------------------------
Enter amount spent on regular items : 85.75
Enter amount spent on special items : 70.5
-------------------------------------------------
Total reward points earned : 226
Subtotal : $156.25
Discount : $3.125
Total due : $153.125
-------------------------------------------------|
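The sample run above can be reproduced with the sketch below. The rules are not stated in this excerpt, so the rates are inferred from the sample figures only: 1 point per dollar on regular items, 2 points per dollar on special items (truncated to a whole number), and a flat 2% discount on the subtotal. The actual assignment may define different or tiered rules.

```python
# Rates inferred from the sample output; treat them as assumptions.
REGULAR_POINTS_PER_DOLLAR = 1
SPECIAL_POINTS_PER_DOLLAR = 2
DISCOUNT_RATE = 0.02


def calculate_rewards(regular: float, special: float):
    """Return (points, subtotal, discount, total_due) for the two amounts."""
    points = int(regular * REGULAR_POINTS_PER_DOLLAR
                 + special * SPECIAL_POINTS_PER_DOLLAR)  # truncate to whole points
    subtotal = regular + special
    discount = subtotal * DISCOUNT_RATE
    total_due = subtotal - discount
    return points, subtotal, discount, total_due


def print_report(regular: float, special: float) -> None:
    """Print the results in the layout of the sample run."""
    points, subtotal, discount, total_due = calculate_rewards(regular, special)
    rule = "-" * 49
    print(rule)
    print(f"Total reward points earned : {points}")
    print(f"Subtotal : ${subtotal}")
    print(f"Discount : ${discount}")
    print(f"Total due : ${total_due}")
    print(rule)


points, subtotal, discount, total_due = calculate_rewards(85.75, 70.5)
```

In an interactive version, the two amounts would come from `float(input(...))` prompts before calling `print_report`.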
|Question||Linear algebra 1. Real business cycles: Consider the following macroeconomic model for real business cycles taken from Dejong and Dave (2011) Chapter 3: Suppose that we have a representative household whose goal is to maximize their utility from consumption and leisure over time:|