I recently subscribed to pluralsight and needed to access the videos out of their walled app. A quick look at lsof when pluralsight is running shows that it's accessing a cached video in my ~/Library.
A quick look at the psv file using the file command fails. Using hexdump doesn't reveal any magic strings. Time to look for other clues. Next step was to sign in onto their website and play videos using the browser and at the same time take a tcpdump..
Comparing the first few bytes from the website and from the PSV reveals something interesting
Aha, it looks like we have a simple XOR obfuscation with a key 0x65. Can we write a simple de-obfuscation script?
You bet!
Now I can play my learning videos on my TV w/out hooking up my laptop.
Monday, August 7, 2017
Tuesday, January 26, 2016
Cost of Petrol in Kenya
In [27]:
import os
import pandas
import ggplot
import requests
import matplotlib
%matplotlib inline
import dateutil.parser
from datetime import datetime
from bs4 import BeautifulSoup
from Pastebin import PastebinAPI
In [2]:
# Brent crude oil data source: http://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=RBRTE&f=D
home_path=os.environ['HOME']
crude_csv_file = os.path.join(home_path,"devel/ipython/data/Europe_Brent_Spot_Price_FOB.csv")
In [3]:
lines_to_skip = 5
names = ['day', 'price']
crude_df = pandas.read_csv(crude_csv_file, names=names, skiprows=lines_to_skip)
# Create a datetime field
crude_df['date'] = crude_df.day.map(lambda x: dateutil.parser.parse(x))
# Normalise price with 100% being the peak price
crude_df['crude_normed_price'] = crude_df.price / max(crude_df.price)
# Re-order the dataframe with the oldest records first
crude_df = crude_df.sort(ascending=False)
crude_df.head()
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/ipykernel/__main__.py:13: FutureWarning: sort(....) is deprecated, use sort_index(.....)
Out[3]:
day | price | date | crude_normed_price | |
---|---|---|---|---|
7273 | 05/20/1987 | 18.63 | 1987-05-20 | 0.129420 |
7272 | 05/21/1987 | 18.45 | 1987-05-21 | 0.128170 |
7271 | 05/22/1987 | 18.55 | 1987-05-22 | 0.128864 |
7270 | 05/25/1987 | 18.60 | 1987-05-25 | 0.129212 |
7269 | 05/26/1987 | 18.63 | 1987-05-26 | 0.129420 |
In [4]:
crude_df[['date', 'price']].plot()
Out[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x115a902d0>
In [48]:
g = ggplot.ggplot(crude_df, ggplot.aes('date', 'price')) + ggplot.geom_line()
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('9 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Brent Crude Price")
g
Out[48]:
<ggplot: (297029425)>
In [49]:
g = ggplot.ggplot(crude_df, ggplot.aes('date', 'crude_normed_price')) + ggplot.geom_line()
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('9 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Normalized Brent Crude Price")
g
Out[49]:
<ggplot: (297251469)>
In [7]:
# Fetch kenya pump prices from the ERC
# Source: http://www.erc.go.ke/index.php?option=com_content&view=article&id=162&Itemid=666
erc_url = 'http://www.erc.go.ke/index.php?option=com_content&view=article&id=162&Itemid=666'
def get_available_erc_data_dates(erc_data_page_url):
'''
The data published by ERC is accessed using an index for a 'month' date_range.
We need to prefetch all the available date indices.
'''
available_dates = []
resp = requests.get(erc_data_page_url)
erc_content = BeautifulSoup(resp.content)
erc_data_table = erc_content.find(name='div', attrs={'class':"item-page"}).table
for option in erc_data_table.find('select').findAll():
available_dates.append({
'date_range': option.attrs.get('value'),
'value': option.text,
'datetime':dateutil.parser.parse(option.text.split('-')[0])
}
)
return available_dates
def fetch_erc_data_for_date_range(erc_data_url, date_range, date_range_datetime):
'''
ERC has a form that you post to a custom date range that spans a month's worth of time
We need to parse the html content and extract the data in html tables
'''
resp = requests.post(erc_data_url, data=dict(date_range=date_range, submitP='Search'))
erc_content = BeautifulSoup(resp.content)
erc_data_table = erc_content.find(name='div', attrs={'class':"item-page"}).table
data_rows = erc_data_table.find('table').findAll('tr')
fields = [header.text for header in data_rows[0].findAll('th')]
rows = []
for row in data_rows[1:]:
row_dict = dict(zip(fields, [column.text for column in row.findAll('td')]))
row_dict['date'] = date_range_datetime
rows.append(row_dict)
return rows
def get_erc_data_as_dict():
rows_of_dicts = []
for available_date in get_available_erc_data_dates(erc_url):
print "Fetching data for %s" % available_date.get('value')
rows = fetch_erc_data_for_date_range(erc_url, available_date.get('date_range'), available_date.get('datetime'))
rows_of_dicts.extend(rows)
return rows_of_dicts
def get_erc_data_as_df():
return pandas.DataFrame(get_erc_data_as_dict())
In [8]:
erc_prices_df = get_erc_data_as_df()
Fetching data for Jan 15th, 2016 - Feb 14th, 2016 Fetching data for Dec 15th, 2015 - Jan 14th, 2016 Fetching data for Nov 15th, 2015 - Dec 14th, 2015 Fetching data for Oct 15th, 2015 - Nov 14th, 2015 Fetching data for Sep 15th, 2015 - Oct 14th, 2015 Fetching data for Aug 15th, 2015 - Sep 14th, 2015 Fetching data for Jul 15th, 2015 - Aug 14th, 2015 Fetching data for Jun 15th, 2015 - Jul 14th, 2015 Fetching data for May 15th, 2015 - Jun 14th, 2015 Fetching data for Apr 15th, 2015 - May 14th, 2015 Fetching data for Mar 15th, 2015 - Apr 14th, 2015 Fetching data for Feb 16th, 2015 - Mar 15th, 2015 Fetching data for Jan 15th, 2015 - Feb 14th, 2015 Fetching data for Dec 15th, 2014 - Jan 14th, 2015 Fetching data for Nov 15th, 2014 - Dec 14th, 2014 Fetching data for Oct 15th, 2014 - Nov 14th, 2014 Fetching data for Sep 15th, 2014 - Oct 15th, 2014 Fetching data for Aug 15th, 2014 - Sep 14th, 2014 Fetching data for Jul 15th, 2014 - Aug 14th, 2014 Fetching data for Jun 15th, 2014 - Jul 15th, 2014 Fetching data for May 15th, 2014 - Jun 15th, 2014 Fetching data for Apr 15th, 2014 - May 14th, 2014 Fetching data for Mar 15th, 2014 - Apr 14th, 2014 Fetching data for Feb 15th, 2014 - Mar 14th, 2014 Fetching data for Jan 15th, 2014 - Feb 14th, 2014 Fetching data for Dec 15th, 2013 - Jan 14th, 2014 Fetching data for Nov 15th, 2013 - Dec 14th, 2013 Fetching data for Oct 15th, 2013 - Nov 14th, 2013 Fetching data for Sep 15th, 2013 - Oct 14th, 2013 Fetching data for Aug 15th, 2013 - Sep 14th, 2013 Fetching data for Jul 15th, 2013 - Aug 14th, 2013 Fetching data for Jun 15th, 2013 - Jul 14th, 2013 Fetching data for May 15th, 2013 - Jun 14th, 2013 Fetching data for Apr 15th, 2013 - May 14th, 2013 Fetching data for Mar 15th, 2013 - Apr 14th, 2013 Fetching data for Feb 15th, 2013 - Mar 14th, 2013
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently. To get rid of this warning, change this: BeautifulSoup([your markup]) to this: BeautifulSoup([your markup], "lxml") markup_type=markup_type))
In [32]:
# Cached at: http://pastebin.com/Z8vqDCgr
# Loadable data: http://pastebin.com/raw/Z8vqDCgr
# data_url = PastebinAPI().paste(pastebin_api_key, erc_prices_df.to_csv(), paste_name="ERC Kenya Petroleum prices", paste_format="text")
# print data_url
http://pastebin.com/Z8vqDCgr
In [9]:
erc_prices_df['super_normed_price'] = erc_prices_df.Super.astype(float) / max(erc_prices_df.Super.astype(float))
zero_idx = erc_prices_df.super_normed_price != 0
erc_prices_df_clean = erc_prices_df[zero_idx]
erc_prices_df_clean.head()
Out[9]:
Diesel | Kerosene | Regular | Super | Town | date | super_normed_price | |
---|---|---|---|---|---|---|---|
0 | 79.55 | 48.93 | 108.03 | 91.31 | Bomet | 2016-01-15 | 0.700069 |
1 | 79.14 | 48.32 | 107.58 | 90.9 | Bondo | 2016-01-15 | 0.696926 |
2 | 79.55 | 48.8 | 108.01 | 91.3 | Bungoma | 2016-01-15 | 0.699992 |
3 | 80.14 | 49.32 | 108.48 | 91.9 | Busia | 2016-01-15 | 0.704593 |
4 | 78.44 | 47.87 | 107.34 | 90.38 | Chuka | 2016-01-15 | 0.692939 |
In [10]:
erc_prices_nairobi_df = erc_prices_df_clean[erc_prices_df.Town=='Nairobi']
erc_prices_nairobi_df.head()
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/pandas/core/frame.py:1997: UserWarning: Boolean Series key will be reindexed to match DataFrame index. "DataFrame index.", UserWarning)
Out[10]:
Diesel | Kerosene | Regular | Super | Town | date | super_normed_price | |
---|---|---|---|---|---|---|---|
50 | 76.7 | 46.13 | 105.76 | 88.64 | Nairobi | 2016-01-15 | 0.679598 |
120 | 78.51 | 53.27 | 105.76 | 90.06 | Nairobi | 2015-12-15 | 0.690485 |
190 | 79.66 | 54.81 | 105.76 | 90.46 | Nairobi | 2015-11-15 | 0.693552 |
260 | 82.43 | 56.04 | 105.76 | 93.29 | Nairobi | 2015-10-15 | 0.715250 |
330 | 79.99 | 52.55 | 0 | 102.65 | Nairobi | 2015-09-15 | 0.787012 |
In [46]:
g = ggplot.ggplot(erc_prices_nairobi_df, ggplot.aes('date', 'Super')) + ggplot.geom_line()
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Nairobi Super Pump Price")
g
Out[46]:
<ggplot: (296571333)>
In [35]:
min_erc_date = min(erc_prices_nairobi_df.date)
erc_crude_date_intersection_df = crude_df[crude_df.date > min_erc_date]
# Renorm the data
erc_crude_date_intersection_df['crude_normed_price'] = erc_crude_date_intersection_df.price / max(erc_crude_date_intersection_df.price)
# Renorm this field for the subsetted dataframe
erc_prices_nairobi_df['super_normed_price'] = erc_prices_nairobi_df.super_normed_price / max(erc_prices_nairobi_df.super_normed_price)
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/ipykernel/__main__.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy /Users/mickeymouse/venvs/default/lib/python2.7/site-packages/ipykernel/__main__.py:7: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
In [44]:
# So let's compare the normalised crude oil purchase price to the normalised Nairobi Pump price
# Ideally, these graphs should be identical!
# If the oligopolies argue that they are a few months late in tracking the global crude prices,
# we should see the red dot graph slightly right shifted over the blue one.
g = ggplot.ggplot(erc_crude_date_intersection_df, ggplot.aes('date', 'crude_normed_price')) + ggplot.geom_line(colour="green")
g = g + ggplot.geom_point(ggplot.aes(x='date', y='super_normed_price'), data=erc_prices_nairobi_df, color="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("% drop in Brent Crude Cost vs Pump Prices in Kenya")
g
Out[44]:
<ggplot: (295938557)>
In [43]:
# Let's plot the difference between the normalised crude oil price and the pump price,
# If profits margins are the same, we should see a straight line here, otherwise,
# - if the graph goes up with time, we are getting shafted.
# - if the graph goes down with time, we are getting a good deal.
# We have to renormalize again because merging may drop some rows.
merged_df = pandas.merge(erc_crude_date_intersection_df, erc_prices_nairobi_df, how='inner', on=['date', 'date'])
merged_df['normalized_margins'] = merged_df.super_normed_price / merged_df.crude_normed_price
merged_df['super_normed_price'] = merged_df.Super.astype(float) / max(merged_df.Super.astype(float))
merged_df['crude_normed_price'] = merged_df.price / max(merged_df.price)
g = ggplot.ggplot(merged_df, ggplot.aes('date', 'normalized_margins')) + ggplot.geom_line(colour="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Estimated Profit Margins based on Crude Prices vs Pump prices for Kenya")
g
Out[43]:
<ggplot: (295937509)>
In [15]:
# As you can see from the graph above, we are thoroughly getting shafted.
# 1. 'Super' petrol margins have gone up by >200% over the past 2 years.
# 2. It's rather obvious that petrol companies in Kenya are not tracking global prices.
# If anything, they are keeping prices artificially high and pocketing the change
# In other words, Petrol companies are making 2-3 times the profit they made in 2013
In [16]:
# So the arguement might be that the Kenyan currency has lost value.
# We can work around this by converting the KES pump price to the equivalent USD value
# at the time by using historical data from CBK.
In [17]:
# Let's fetch Central bank of Kenya Forex data
# Source https://www.centralbank.go.ke/index.php/interest-rates/commercial-banks-weighted-av
cbk_data_url = 'https://www.centralbank.go.ke/index.php/interest-rates/commercial-banks-weighted-av'
def fetch_cbk_data_as_dicts(cbk_data_url):
resp = requests.post(cbk_data_url,
data=dict(
# Start date
date=22, month=1, year=2007,
# End date
tdate=22, tmonth=1, tyear=2016,
currency='US DOLLAR', searchForex='Search'),
# CBK has a braindead policy of blocking non-ui browser agents
headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0'})
cbk_content = BeautifulSoup(resp.content)
cbk_data_table_header = cbk_content.find(name='div', attrs={'id':"interbank"}).table.find('tr', attrs={'class': 'tr_heading'})
headers = [td.text for td in cbk_data_table_header.findAll('td')]
rows_of_dicts = []
for table_row in cbk_data_table_header.fetchNextSiblings():
row = [td.text for td in table_row.findAll('td')]
rows_of_dicts.append(dict(zip(headers, row)))
return rows_of_dicts
def fetch_cbk_data_as_df(cbk_data_url):
return pandas.DataFrame(fetch_cbk_data_as_dicts(cbk_data_url))
In [18]:
cbk_usd_df = fetch_cbk_data_as_df(cbk_data_url)
cbk_usd_df['date'] = cbk_usd_df.Date.map(lambda x: dateutil.parser.parse(x, dayfirst=True))
cbk_usd_df = cbk_usd_df.sort(['date'], ascending=True)
cbk_usd_df.head(15)
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/ipykernel/__main__.py:3: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....) app.launch_new_instance()
Out[18]:
Buy | Currency | Date | Mean | Sell | date | |
---|---|---|---|---|---|---|
76 | 69.8256 | US DOLLAR | 22-01-2007 | 69.8978 | 69.9700 | 2007-01-22 |
77 | 70.0011 | US DOLLAR | 23-01-2007 | 70.0733 | 70.1456 | 2007-01-23 |
78 | 70.0711 | US DOLLAR | 24-01-2007 | 70.1433 | 70.2156 | 2007-01-24 |
79 | 70.2300 | US DOLLAR | 25-01-2007 | 70.2994 | 70.3689 | 2007-01-25 |
80 | 70.4411 | US DOLLAR | 26-01-2007 | 70.5056 | 70.5700 | 2007-01-26 |
81 | 70.5533 | US DOLLAR | 29-01-2007 | 70.6256 | 70.6978 | 2007-01-29 |
82 | 70.4956 | US DOLLAR | 30-01-2007 | 70.5567 | 70.6178 | 2007-01-30 |
83 | 70.4756 | US DOLLAR | 31-01-2007 | 70.5367 | 70.5978 | 2007-01-31 |
84 | 70.3167 | US DOLLAR | 01-02-2007 | 70.3722 | 70.4278 | 2007-02-01 |
85 | 70.2000 | US DOLLAR | 02-02-2007 | 70.2722 | 70.3444 | 2007-02-02 |
86 | 70.1411 | US DOLLAR | 05-02-2007 | 70.2078 | 70.2744 | 2007-02-05 |
87 | 69.9544 | US DOLLAR | 06-02-2007 | 70.0211 | 70.0878 | 2007-02-06 |
88 | 69.9033 | US DOLLAR | 07-02-2007 | 69.9700 | 70.0367 | 2007-02-07 |
89 | 69.6667 | US DOLLAR | 08-02-2007 | 69.7389 | 69.8111 | 2007-02-08 |
90 | 69.6533 | US DOLLAR | 09-02-2007 | 69.7256 | 69.7978 | 2007-02-09 |
In [39]:
# Cached here: http://pastebin.com/345FHtWH
# data_url = PastebinAPI().paste(pastebin_api_key, cbk_usd_df.to_csv(), paste_name="CBK Kenya USD Forex rates", paste_format="text")
# print data_url
http://pastebin.com/345FHtWH
In [42]:
# Let's see what the USD / KES rate looks like
g = ggplot.ggplot(cbk_usd_df, ggplot.aes('date', 'Buy')) + ggplot.geom_line(colour="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('3 month'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Kenya USD Forex Rate")
g
Out[42]:
<ggplot: (295776609)>
In [86]:
merged_df_with_usd = pandas.merge(merged_df, cbk_usd_df, how='inner', on=['date', 'date'])
# Convert the super price to USD
merged_df_with_usd['super_as_usd'] = merged_df_with_usd.Super.astype(float) / merged_df_with_usd.Mean.astype(float)
# Normalise the USD super price
merged_df_with_usd['normalised_super_as_usd'] = merged_df_with_usd.super_as_usd / max(merged_df_with_usd.super_as_usd)
# And the USD Margins
merged_df_with_usd['normalized_margins_usd'] = merged_df_with_usd.normalised_super_as_usd / merged_df_with_usd.crude_normed_price
merged_df_with_usd.head()
Out[86]:
day | price | date | crude_normed_price | Diesel | Kerosene | Regular | Super | Town | super_normed_price | normalized_margins | Buy | Currency | Date | Mean | Sell | super_as_usd | normalised_super_as_usd | normalized_margins_usd | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 03/15/2013 | 109.32 | 2013-03-15 | 0.979745 | 107.37 | 88.54 | 105.76 | 117.69 | Nairobi | 1.000000 | 1.071625 | 85.5389 | US DOLLAR | 15-03-2013 | 85.6361 | 85.7333 | 1.374304 | 1.000000 | 1.020673 |
1 | 05/15/2013 | 101.57 | 2013-05-15 | 0.910289 | 101.06 | 79.46 | 105.76 | 113.24 | Nairobi | 0.962189 | 1.109781 | 83.7222 | US DOLLAR | 15-05-2013 | 83.8069 | 83.8917 | 1.351201 | 0.983190 | 1.080086 |
2 | 07/15/2013 | 109.05 | 2013-07-15 | 0.977326 | 102.86 | 79.49 | 105.76 | 109.52 | Nairobi | 0.930580 | 0.999702 | 87.1344 | US DOLLAR | 15-07-2013 | 87.2364 | 87.3383 | 1.255439 | 0.913509 | 0.934703 |
3 | 08/15/2013 | 111.58 | 2013-08-15 | 1.000000 | 104.44 | 83.93 | 105.76 | 112.26 | Nairobi | 0.953862 | 1.001478 | 87.5044 | US DOLLAR | 15-08-2013 | 87.5906 | 87.6767 | 1.281644 | 0.932577 | 0.932577 |
4 | 10/15/2013 | 110.67 | 2013-10-15 | 0.991844 | 104.47 | 85.01 | 105.76 | 112.27 | Nairobi | 0.953947 | 1.009803 | 84.8944 | US DOLLAR | 15-10-2013 | 84.9944 | 85.0944 | 1.320911 | 0.961149 | 0.969052 |
In [41]:
# How does the USD price of the pump prices in Kenya look like?
g = ggplot.ggplot(merged_df_with_usd, ggplot.aes('date', 'super_as_usd')) + ggplot.geom_line(colour="green")
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("Kenya Super Price (USD)")
g
Out[41]:
<ggplot: (295167589)>
In [51]:
# So let's compare how all three normalised prices.
# If there's a huge departured between the normalised_USD pump price and the KES pump price in the graph
# below, we can the ascribe the margin deltas to the devaluation of KES.
# If there change between the two graphs is marginal, then we can infer that the devaluation arguement is moot
g = ggplot.ggplot(merged_df_with_usd, ggplot.aes('date', 'crude_normed_price')) + ggplot.geom_line(colour="green")
g = g + ggplot.geom_point(ggplot.aes(x='date', y='super_normed_price'), data=merged_df_with_usd, color="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.geom_point(ggplot.aes(x='date', y='normalised_super_as_usd'), data=merged_df_with_usd, color="blue")
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("% Price Drop of KES Kenya Pump Price (Red dot) vs USD Kenya Pump Price (Blue dot) vs Crude Price (Green line)")
g
Out[51]:
<ggplot: (297819617)>
In [62]:
# US Gas prices. Source: https://ycharts.com/indicators/gas_price
us_gas_price_url = 'https://ycharts.com/charts/fund_data/json?securities=type%3Aindicator%2Cid%3AI%3AUSRGP%2Cinclude%3Atrue%2C%2C&calcs=id%3Aprice%2Cinclude%3Atrue%2C%2C&correlations=&format=real&recessions=false&zoom=5&startDate=&endDate=&chartView=&splitType=&scaleType=¬e=&title=&source=&units="eLegend=&partner="es=&legendOnChart=&securitylistSecurityId=&clientGroupLogoUrl=&maxPoints=650&'
def get_us_gas_prices(us_gas_price_url):
resp = requests.get(us_gas_price_url)
us_gas_prices_json = resp.json()
return [{'date': datetime.fromtimestamp(row[0] / 1000), 'gallon_price': row[1]}
for row in us_gas_prices_json['chart_data'][0][0]['raw_data']]
us_gas_prices = get_us_gas_prices(us_gas_price_url)
us_gas_prices_df = pandas.DataFrame(us_gas_prices)
In [52]:
# Cached here: http://pastebin.com/Jda1iAAT
# data_url = PastebinAPI().paste(pastebin_api_key, us_gas_prices_df.to_csv(), paste_name="US Gas Cost from ycharts", paste_format="text")
# print data_url
http://pastebin.com/Jda1iAAT
In [64]:
g = ggplot.ggplot(us_gas_prices_df, ggplot.aes('date', 'gallon_price')) + ggplot.geom_line(colour="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("US Pump Price per Gallon")
g
Out[64]:
<ggplot: (298367705)>
In [77]:
# Normalized and compared to KE & crude prices
min_erc_date = min(erc_prices_nairobi_df.date)
erc_us_gas_prices_df_intersection_df = us_gas_prices_df[us_gas_prices_df.date > min_erc_date.date()]
erc_us_gas_prices_df_intersection_df['normalized_gallon_price'] = erc_us_gas_prices_df_intersection_df.gallon_price / max(erc_us_gas_prices_df_intersection_df.gallon_price)
g = ggplot.ggplot(merged_df_with_usd, ggplot.aes('date', 'crude_normed_price')) + ggplot.geom_line(colour="green")
g = g + ggplot.geom_point(ggplot.aes(x='date', y='super_normed_price'), data=merged_df_with_usd, color="red")
g = g + ggplot.geom_point(ggplot.aes(x='date', y='normalized_gallon_price'), data=erc_us_gas_prices_df_intersection_df, color="brown")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.geom_point(ggplot.aes(x='date', y='normalised_super_as_usd'), data=merged_df_with_usd, color="blue")
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90)) + ggplot.ggtitle("% Price Drop of KES Kenya Pump Price (Red dot) vs USD Kenya Pump Price (Blue dot) \n vs Crude Price (Green line) vs US Pump Prices(Brown)")
g
/Users/mickeymouse/venvs/default/lib/python2.7/site-packages/ipykernel/__main__.py:5: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Out[77]:
<ggplot: (298598673)>
In [87]:
# Granted that the KES has devalued in th recent past, there devalution has changed the USD pump price by at most 10%
# If we plot the margins, we should still see that the Oilers are making significantly more in profits
g = ggplot.ggplot(merged_df_with_usd, ggplot.aes('date', 'normalized_margins')) + ggplot.geom_line(colour="red")
g = g + ggplot.geom_point(ggplot.aes(x='date', y='normalized_margins_usd'), data=merged_df_with_usd, color="red")
g = g + ggplot.scale_x_date(labels = ggplot.date_format("%Y-%m-%d"), breaks=ggplot.date_breaks('1 months'))
g = g + ggplot.theme(axis_text_x = ggplot.element_text(angle = 90))
g
Out[87]:
<ggplot: (300572801)>
In [96]:
# The points above still show that exchange rate impact is minimal to profit margins.
# If the oilers are to get the same profits as 2014-10-01 (~1.4 over the Brent crude as opposed to the current ~3.0),
# then the pump prices should be current price * 1.4 / 3.0
#In other words, we should be paying
todays_row = merged_df_with_usd.tail(1)
todays_row
fair_price = todays_row['crude_normed_price'].astype('float') * max(merged_df_with_usd['super_as_usd']).astype('float') * todays_row['Mean'].astype('float')
print "Fair pump price (KES): ",fair_price.values[0]
# Yep the figure is 36.30 Shillings.
# As USD
fair_price_usd = fair_price.values[0] / todays_row['Mean'].astype(float)
print "Fair pump price (USD): ", fair_price_usd.values[0]
# According to this: http://www.thenakedscientists.com/forum/index.php?topic=42516.0
# 1 barrel of oil ~ 42 us gallons
barrel_in_liters = 3.79 * 42
# Total cost of raw crude per liter in KES
raw_crude_cost_per_liter = todays_row['Mean'].astype(float) * todays_row['price'] / barrel_in_liters
print "Raw crude cost per liter (KES): ", raw_crude_cost_per_liter.values[0]
print "Cost of distillation + profit as a function of the raw product per liter (Unfair price):", (todays_row['Super'].astype('float') - raw_crude_cost_per_liter.values[0]).values[0]
print "Cost of distillation + profit as a function of the raw product per liter (Fair price):", (fair_price - raw_crude_cost_per_liter.values[0]).values[0]
print "Oilers extra profit per liter (over and above 2013 profit margins): ", (todays_row['Super'].astype('float') - fair_price).values[0]
Fair pump price (KES): 36.3067410612 Fair pump price (USD): 0.354722562333 Raw crude cost per liter (KES): 18.5183565775 Cost of distillation + profit as a function of the raw product per liter (Unfair price): 70.1216434225 Cost of distillation + profit as a function of the raw product per liter (Fair price): 17.7883844837 Oilers extra profit per liter (over and above 2013 profit margins): 52.3332589388
In [97]:
# Summary
# Sources: https://ycharts.com/indicators/gas_price
# 1. US Gas (Petrol) prices were 3.764 in Mar 18 2013,
# 2. US Gas prices are now 2.02 in Jan 18 2016
# 3. Total percentage change: 46% drop in price: -46.3336875664187
print "Drop in US pump price:", (2.02 - 3.764) * 100 / 3.764
# 4. KE GAS prices in Mar 15 2013 was: 117.69
# 5. KE Gas Prices are now: 88.64
# 6. Total percentage change: 24% drop in price.
print "Drop in KE pump price:", (88.64 - 117.69) * 100 / 117.69
# Let's take care of devaluation
# 4. KE USD GAS prices in Mar 15 2013 was: 1.374304
# 5. KE USD Gas Prices are now: 0.866027
# 6. Total percentage change: 24% drop in price.
print "Drop in KE (USD) pump price:", (0.866027 - 1.374304) * 100 / 1.374304
# If we were to pay the US rate, this is the pump price that we should pay (inclusive of currency devaluation):
print "Pump price if we were to drop our prices to the same level as the US...: ", 46.3336875664/ 100 * 1.374304 * 102.3525
#(US % Drop) * Price in Mar 2013 as USD * Current USD/KES Forex rate
Drop in US pump price: -46.3336875664 Drop in KE pump price: -24.683490526 Drop in KE (USD) pump price: -36.9843207907 Pump price if we were to drop our prices to the same level as the US...: 65.1745635173
In [ ]:
Subscribe to:
Posts (Atom)