ITE351 AI & Applications Midterm Exam -Solved

 
READ INSTRUCTIONS VERY CAREFULLY! 
You are doing ML analysis for cryptocurrency trading, with the goal of making a profit. Our target 
cryptocurrency is one of the most popular ones, Bitcoin (BTC). KRW below means Korean Won. 
*2019-05-trade.csv data format: This file contains the trading history of the BTC-KRW market 
during May 16-31, 2019. It is basically a file showing the sequence of Buy and Sell trades made for 
BTC. It may be possible to use this data as a training/test dataset for a smart transaction agent 
for cryptocurrency. 
- timestamp: time at which the trade took place, yyyy-mm-dd HH:MM 
- quantity: BTC size in the trade 
- price: price of 1 BTC in KRW 
- fee: ignore this 
- amount: the total amount of KRW in the trade, quantity * price 
- side: 0 means Buy (also known as ‘Bid’), 1 means Sell (also known as ‘Ask’) 
Here is one example from May 16, 2019. 
… 
2019-05-16 13:40, 0.20, 9549000, 954.90, 1908845, 1 
… 
On May 16, 2019 at 13:40, 0.20 BTC was sold (side: 1) at a price of 9,549,000 KRW per BTC. 
*2019-05-17-BTC-orderbook.csv data format: This file contains the so-called orderbook of the BTC 
market. The data holds the historical records of willingness to Buy and Sell BTC for every second. 
Each second of data has multiple lines of Buy and Sell requests. The first several lines represent 
Buy requests, and the next several lines represent Sell requests. 
- price: price of 1 BTC in KRW 
- quantity: BTC size someone is willing to Buy or Sell 
- type: 0 means Buy (‘Bid’), 1 means Sell (‘Ask’) 
- timestamp: time of market, yyyy-mm-dd HH:MM:SS.us 
The following (orderbook data) example is from May 17, 2019. 
… 
9435000.0, 1.6979, 0, 2019-05-17 00:00:00.962338  ← Top Level Buy (top_bid_price) 
9434000.0, 0.0015, 0, 2019-05-17 00:00:00.962338  ← Level 2 Buy 
9433000.0, 0.0018, 0, 2019-05-17 00:00:00.962338  ← Level 3 Buy 
9431000.0, 0.0475, 0, 2019-05-17 00:00:00.962338 
9430000.0, 0.8173, 0, 2019-05-17 00:00:00.962338 
9450000.0, 24.1714, 1, 2019-05-17 00:00:00.962338  ← Top Level Sell (top_ask_price) 
9455000.0, 0.2023, 1, 2019-05-17 00:00:00.962338  ← Level 2 Sell 
9458000.0, 0.0112, 1, 2019-05-17 00:00:00.962338  ← Level 3 Sell 
9459000.0, 0.3099, 1, 2019-05-17 00:00:00.962338 
9462000.0, 1.3064, 1, 2019-05-17 00:00:00.962338 
… 
At Top Level Buy, someone wants to buy 1.6979 BTC at a price of 9,435,000 KRW at 
00:00:00 on May 17, 2019 (right at midnight). At Top Level Sell, someone wants to sell 
24.1714 BTC at a price of 9,450,000 KRW. So these datasets show the buy and sell 
requests in the market. Somebody can make some money! 
*Task 1: File needed for Task 1: 2019-05-trade.csv. Compute the total profit for May 
in KRW. In other words, how much money did we make or lose? To calculate the exact profit 
over the days, compute it at the moments when the cumulative quantity is close to 0, that is, 
when the quantity bought and the quantity sold are equal (the difference between them is 
close to 0). Consider only four digits after the decimal point and ignore the rest 
when you are calculating. Show the difference between the KRW spent on buying and the KRW 
received from selling at those moment(s). This is the “exact profit” of 2019-05-trade.csv. 
Alternatively, there is a simpler way to calculate an “approximate profit” using the ‘amount’ 
column. You can easily figure this one out. If your answer shows the “exact profit” and its 
process, full marks (100%) will be given for Task 1. If your answer shows the “approximate 
profit”, 80% will be given. 
Show your code. If you would like to explain your steps or raise any concerns, write them 
down clearly. 
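One possible reading of Task 1, sketched in Python with the standard library only. It assumes the file has no header row and the six columns listed above, treats Buy as adding to the position and Sell as subtracting from it, and truncates each quantity to four decimal places before accumulating. The function name `profits` and the toy rows are made up for illustration; they are not from the dataset.

```python
import csv
import io
from decimal import Decimal, ROUND_DOWN

FOUR_DP = Decimal("0.0001")
# Assumed column order, taken from the data format description (no header row assumed).
COLUMNS = ["timestamp", "quantity", "price", "fee", "amount", "side"]

def profits(csv_file):
    """Return (exact_profit, approximate_profit, flat_points) in KRW."""
    position = Decimal("0")  # cumulative BTC: Buy adds, Sell subtracts
    cash = Decimal("0")      # cumulative KRW: Sell adds, Buy subtracts
    flat_points = []         # (timestamp, cash) whenever the position is back to 0
    reader = csv.DictReader(csv_file, fieldnames=COLUMNS, skipinitialspace=True)
    for row in reader:
        # Keep only four decimal digits of the quantity, dropping the rest.
        qty = Decimal(row["quantity"]).quantize(FOUR_DP, rounding=ROUND_DOWN)
        amount = Decimal(row["amount"])
        if int(row["side"]) == 0:  # Buy
            position += qty
            cash -= amount
        else:                      # Sell
            position -= qty
            cash += amount
        if position == 0:
            flat_points.append((row["timestamp"], cash))
    # Exact profit: cash balance at the last moment bought and sold quantities match.
    exact = flat_points[-1][1] if flat_points else None
    # Approximate profit: total Sell amount minus total Buy amount over all rows.
    return exact, cash, flat_points

# Toy rows (made-up numbers) to show the mechanics.
toy = io.StringIO(
    "2019-05-16 09:00, 0.50001, 9000000, 0, 4500000, 0\n"
    "2019-05-16 09:05, 0.50009, 9100000, 0, 4550000, 1\n"
    "2019-05-16 09:10, 0.2, 9200000, 0, 1840000, 0\n"
)
exact_profit, approx_profit, flat_points = profits(toy)
print(exact_profit, approx_profit)
```

With the real file, the same function could be called as `profits(open("2019-05-trade.csv", newline=""))`. In the toy data the last Buy is never sold, which is why the approximate figure differs from the exact one.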
*Task 2: File needed for Task 2: 2019-05-trade.csv. Report the number of Buy trades and the 
number of Sell trades separately. Draw a daily time-series bar graph illustrating changes in 
transaction counts (x-axis: days, y1-axis: Sell, y2-axis: Buy). The following sample graph is 
drawn hourly. Your graph should look similar, but by day. 
*Task 3: Files needed for Task 3: 2019-05-trade.csv and 2019-05-17-BTC-orderbook.csv. 
Compute the following features and modify 2019-05-trade.csv. Show the first 20 
and last 20 lines of your new csv data. For the new csv data, you will remove a few existing 
columns and add three new columns: MidPrice, Bfeature, and Alpha. To compute these 
three columns, check out the following: 
*How to compute MidPrice: ask means Sell, bid means Buy. 
MidPrice = (top_ask_price + top_bid_price) / 2 
*How to compute Bfeature: 
 
askQty = orderbook_ask_quantity.average()  # average quantity of all levels for Sell (side 1) 
bidQty = orderbook_bid_quantity.average()  # likewise for Buy (side 0) 
bidPx = orderbook_bid_price.average()      # average price of all levels for Buy (side 0) 
 
book_price = (askQty * bidPx) / bidQty 
 
Bfeature = book_price - MidPrice 
*How to compute Alpha: 
Alpha = Bfeature * MidPrice 
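The three formulas above can be sketched for a single orderbook snapshot as follows. This is an illustrative sketch, not a reference solution: the function name `features` is hypothetical, and it assumes the top bid is the highest Buy price and the top ask is the lowest Sell price in the snapshot. The input rows are the five-level example from the orderbook format description.

```python
from statistics import mean

def features(levels):
    """levels: list of (price, quantity, type) for one snapshot; type 0 = Buy, 1 = Sell."""
    bids = [(p, q) for p, q, t in levels if t == 0]
    asks = [(p, q) for p, q, t in levels if t == 1]
    top_bid_price = max(p for p, _ in bids)  # assumption: top level = highest bid
    top_ask_price = min(p for p, _ in asks)  # assumption: top level = lowest ask
    mid_price = (top_ask_price + top_bid_price) / 2
    ask_qty = mean([q for _, q in asks])     # average Sell quantity over all levels
    bid_qty = mean([q for _, q in bids])     # average Buy quantity over all levels
    bid_px = mean([p for p, _ in bids])      # average Buy price over all levels
    book_price = (ask_qty * bid_px) / bid_qty
    bfeature = book_price - mid_price
    alpha = bfeature * mid_price
    return mid_price, bfeature, alpha

snapshot = [
    (9435000.0, 1.6979, 0), (9434000.0, 0.0015, 0), (9433000.0, 0.0018, 0),
    (9431000.0, 0.0475, 0), (9430000.0, 0.8173, 0),
    (9450000.0, 24.1714, 1), (9455000.0, 0.2023, 1), (9458000.0, 0.0112, 1),
    (9459000.0, 0.3099, 1), (9462000.0, 1.3064, 1),
]
mid_price, bfeature, alpha = features(snapshot)
print(mid_price)  # (9450000 + 9435000) / 2 = 9442500.0
```

Because askQty is much larger than bidQty in this snapshot, book_price ends up far above MidPrice, so Bfeature is a large positive number here.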
 
* Your new 2019-05-trade.csv data format will be: 
timestamp, quantity, price, midprice, bfeature, alpha, side 
For example, the first row for 2019-05-17 in 2019-05-trade.csv is: 
2019-05-17 00:00, 0.05770069, 9449000, 272.60, 544941, 1 
It happened at 2019-05-17 00:00, so use the very first snapshot of that minute, at 
00:00:00.962338, from 2019-05-17-BTC-orderbook.csv. Use the following dataset to compute 
midprice, bfeature, and alpha for 2019-05-17 00:00 and add them as columns to your new 
2019-05-trade.csv file. 
9449000.0, 40.39262446, 0, 2019-05-17 00:00:00.962338 
9448000.0, 0.17676982, 0, 2019-05-17 00:00:00.962338 
9446000.0, 2.11887902, 0, 2019-05-17 00:00:00.962338 
9445000.0, 0.02805574, 0, 2019-05-17 00:00:00.962338 
9441000.0, 0.74081055, 0, 2019-05-17 00:00:00.962338 
9440000.0, 0.2338522, 0, 2019-05-17 00:00:00.962338 
9439000.0, 0.01637451, 0, 2019-05-17 00:00:00.962338 
9438000.0, 0.29291399, 0, 2019-05-17 00:00:00.962338 
9437000.0, 0.48892999, 0, 2019-05-17 00:00:00.962338 
9436000.0, 0.11283857, 0, 2019-05-17 00:00:00.962338 
9435000.0, 1.69795923, 0, 2019-05-17 00:00:00.962338 
9434000.0, 0.00155999, 0, 2019-05-17 00:00:00.962338 
9433000.0, 0.00189399, 0, 2019-05-17 00:00:00.962338 
9431000.0, 0.04756431, 0, 2019-05-17 00:00:00.962338 
9430000.0, 0.81736619, 0, 2019-05-17 00:00:00.962338 
9450000.0, 24.17141283, 1, 2019-05-17 00:00:00.962338 
9455000.0, 0.20238048, 1, 2019-05-17 00:00:00.962338 
9458000.0, 0.01123548, 1, 2019-05-17 00:00:00.962338 
9459000.0, 0.30999, 1, 2019-05-17 00:00:00.962338 
9462000.0, 1.30642, 1, 2019-05-17 00:00:00.962338 
9465000.0, 0.44874552, 1, 2019-05-17 00:00:00.962338 
9466000.0, 1.0, 1, 2019-05-17 00:00:00.962338 
9473000.0, 0.31419, 1, 2019-05-17 00:00:00.962338 
9475000.0, 3.061, 1, 2019-05-17 00:00:00.962338 
9480000.0, 1.29058544, 1, 2019-05-17 00:00:00.962338 
9481000.0, 0.29193, 1, 2019-05-17 00:00:00.962338 
9482000.0, 0.0517347, 1, 2019-05-17 00:00:00.962338 
9483000.0, 0.488, 1, 2019-05-17 00:00:00.962338 
9490000.0, 0.08680052, 1, 2019-05-17 00:00:00.962338 
9491000.0, 0.699, 1, 2019-05-17 00:00:00.962338 
Then search for the next trade row, 2019-05-17 00:02, find 2019-05-17 00:02:00 in the 
orderbook data file, and compute each feature again. Repeat this for every 
timestamp in 2019-05-17. 
Full marks will be given if you find the corresponding timestamps in the trade and orderbook 
files and add the midprice, bfeature, and alpha to your new trade data. Do this only for the 
timestamps that exist in the trade file! Some timestamps are missing from the orderbook data 
file; for those, skip the computation and just fill in 0. 
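The matching procedure described above can be sketched as follows, using only the Python standard library. Treat it as one interpretation under stated assumptions: for each trade minute it uses the first orderbook snapshot whose timestamp falls in second :00 of that minute, and fills 0 when no such snapshot exists. All function names are hypothetical. The orderbook rows are the top three levels per side from the listing above; the second trade row is made up to show the fill-with-0 case.

```python
from statistics import mean

def features(levels):
    """Return (midprice, bfeature, alpha) for one snapshot of (price, quantity, type)."""
    bids = [(p, q) for p, q, t in levels if t == 0]
    asks = [(p, q) for p, q, t in levels if t == 1]
    if not bids or not asks:
        return 0, 0, 0
    mid = (min(p for p, _ in asks) + max(p for p, _ in bids)) / 2
    book_price = (mean([q for _, q in asks]) * mean([p for p, _ in bids])
                  / mean([q for _, q in bids]))
    bfeature = book_price - mid
    return mid, bfeature, bfeature * mid

def snapshots_at_minute_start(orderbook_rows):
    """Map 'yyyy-mm-dd HH:MM' to the levels of the first snapshot in second :00."""
    first_ts, levels_by_minute = {}, {}
    for price, qty, typ, ts in orderbook_rows:
        if ts[17:19] != "00":        # keep only snapshots taken in second :00
            continue
        minute = ts[:16]             # 'yyyy-mm-dd HH:MM'
        if first_ts.setdefault(minute, ts) == ts:
            levels_by_minute.setdefault(minute, []).append((price, qty, typ))
    return levels_by_minute

def add_features(trade_rows, levels_by_minute):
    """Build rows in the new format: timestamp, quantity, price, midprice, bfeature, alpha, side."""
    out = []
    for ts, qty, price, fee, amount, side in trade_rows:
        levels = levels_by_minute.get(ts)
        mid, bfeature, alpha = features(levels) if levels else (0, 0, 0)
        out.append((ts, qty, price, mid, bfeature, alpha, side))
    return out

stamp = "2019-05-17 00:00:00.962338"
orderbook = [
    (9449000.0, 40.39262446, 0, stamp), (9448000.0, 0.17676982, 0, stamp),
    (9446000.0, 2.11887902, 0, stamp), (9450000.0, 24.17141283, 1, stamp),
    (9455000.0, 0.20238048, 1, stamp), (9458000.0, 0.01123548, 1, stamp),
]
trades = [
    ("2019-05-17 00:00", 0.05770069, 9449000, 272.60, 544941, 1),
    ("2019-05-17 00:02", 0.01, 9450000, 0.0, 94500, 0),  # made-up row, no snapshot
]
new_rows = add_features(trades, snapshots_at_minute_start(orderbook))
for row in new_rows:
    print(row)
```

Several trades in the same minute would all receive the same three feature values under this matching rule, since the trade timestamps only have minute resolution.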
*(BONUS) Task 4: How can you use the data file from Task 3 to create a smart trading agent? You 
can show how to create the training and test datasets, and how to use ML (the R or Python random 
forest exercise from our class, or others such as PCA, anything you like) to create a learning agent 
for cryptocurrency transactions. What would be an appropriate target feature here, and what about 
the remaining features? Providing code and a running example (with an attached screenshot) is 
best. An explanation in words is also okay. Be clear and show code if you can (if you want, you 
can explain in code comments). Accuracy does not matter. Take a good look at the samples we 
have seen in class and use them wisely. 
For example, 
- show how to manipulate the dataset (reading, processing, etc.) 
- show how to split training/test dataset 
- show how to train or make the model 
- show how to use the built model to test 
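The four steps above can be sketched end to end as follows. To stay within the Python standard library, this sketch swaps the random forest mentioned in the task for a single-feature decision stump (one threshold on bfeature); it is a stand-in, not the class exercise. The choice of target (whether the next row's midprice is higher than the current one), the chronological 80/20 split, and all function names are assumptions, and the toy rows are made up.

```python
def make_dataset(rows):
    """Feature: bfeature of the current row. Target: 1 if the next midprice is higher."""
    X, y = [], []
    for cur, nxt in zip(rows, rows[1:]):
        X.append(cur["bfeature"])
        y.append(1 if nxt["midprice"] > cur["midprice"] else 0)
    return X, y

def chrono_split(X, y, train_frac=0.8):
    """Split in time order (no shuffling) so the test set lies after the training set."""
    cut = int(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]

def predict(model, x):
    threshold, label_above = model
    return label_above if x > threshold else 1 - label_above

def accuracy(model, X, y):
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)

def fit_stump(X, y):
    """Try every observed value as a threshold and keep the most accurate rule."""
    best_model, best_acc = None, -1.0
    for threshold in sorted(set(X)):
        for label_above in (0, 1):
            acc = accuracy((threshold, label_above), X, y)
            if acc > best_acc:
                best_model, best_acc = (threshold, label_above), acc
    return best_model

# Toy data (made up): a positive bfeature always precedes a rise in midprice.
mids = [100, 101, 100, 101, 102, 101, 102, 103, 102, 103, 104]
bfs = [5.0, -3.0, 4.0, 6.0, -2.0, 3.0, 7.0, -4.0, 2.0, 5.0, 1.0]
rows = [{"midprice": m, "bfeature": b} for m, b in zip(mids, bfs)]

X, y = make_dataset(rows)                                # manipulate the dataset
X_train, y_train, X_test, y_test = chrono_split(X, y)    # split training/test
model = fit_stump(X_train, y_train)                      # train the model
test_accuracy = accuracy(model, X_test, y_test)          # test the built model
print(model, test_accuracy)
```

The toy data is perfectly separable by construction, so the resulting accuracy says nothing about real market data. With the Task 3 file, `rows` would instead be built from its midprice and bfeature columns, skipping the rows that were filled with 0.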
