Prezto is pretty cool. I love it. Here are my settings :)
Friday, October 31, 2014
Tuesday, August 19, 2014
flask ( or itsdangerous ) secret key size.
I recently needed to figure out the recommended key size for Flask's secret key. Trawling through Flask's source, I discovered that it uses itsdangerous for signing. The signer in turn uses HMAC with either a configured hash algorithm or a default one. The default digest method in itsdangerous is SHA-1.
According to wikipedia:
The cryptographic strength of the HMAC depends upon the size of the secret key that is used.
The HMAC RFC in turn states that:
2. Definition of HMAC
....The definition of HMAC requires a cryptographic hash function, which we denote by H, and a secret key K. We assume H to be a cryptographic hash function where data is hashed by iterating a basic compression function on blocks of data. We denote by B the byte-length of such blocks (B=64 for all the above mentioned examples of hash functions), and by L the byte-length of hash outputs (L=16 for MD5, L=20 for SHA-1). The authentication key K can be of any length up to B, the block length of the hash function. Applications that use keys longer than B bytes will first hash the key using H and then use the resultant L byte string as the actual key to HMAC. In any case the minimal recommended length for K is L bytes (as the hash output length). See section 3 for more information on keys.
....
3. Keys
The key for HMAC can be of any length (keys longer than B bytes are first hashed using H). However, less than L bytes is strongly discouraged as it would decrease the security strength of the function. Keys longer than L bytes are acceptable but the extra length would not significantly increase the function strength. (A longer key may be advisable if the randomness of the key is considered weak.) Keys need to be chosen at random (or using a cryptographically strong pseudo-random generator seeded with a random seed), and periodically refreshed. (Current attacks do not indicate a specific recommended frequency for key changes as these attacks are practically infeasible. However, periodic key refreshment is a fundamental security practice that helps against potential weaknesses of the function and keys, and limits the damage of an exposed key.)
So that in effect means that our secret key should be at least 16 bytes for MD5, 20 bytes for SHA-1, and larger if you use SHA-2 or SHA-3. Use the output bits column of this table to figure out what your secret key size ought to be. For the Flask secret key size, I believe that a 32-byte secret key should be sufficient (and a 16-byte secret key risky... :)
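The RFC's minimum key length L is simply the digest size of the hash, which Python's hashlib exposes. A minimal sketch, assuming the digest names below are available in your Python build; the helper name min_hmac_key_bytes is made up for illustration:

```python
import hashlib


def min_hmac_key_bytes(digest_name):
    """Return the hash output length in bytes (the RFC's L).

    The RFC quoted above recommends an HMAC key of at least this many bytes.
    """
    return hashlib.new(digest_name).digest_size


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha512"):
        print(name, min_hmac_key_bytes(name))
```

With SHA-1 this gives 20 bytes, so a 32-byte key sits comfortably above the minimum.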
My secret key block then becomes:
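KEY_SIZE = 32
# Read the key from the kernel's blocking entropy source and close the file afterwards
with open("/dev/random", "rb") as f:
    SECRET_KEY = f.read(KEY_SIZE)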
os.urandom from the stdlib may not cut it since it sources /dev/urandom.
Caveat: YMMV, I am not a security expert!
Thursday, July 31, 2014
[Announce] BARCAMP NAIROBI 2014 – Who’s Your Data’s Daddy?
Nairobi’s premier technology event is back for 2014! The 8th Barcamp Nairobi will be held on Saturday, 30th August, 2014. Barcamp is produced by Skunkworks Kenya – a disruptive collective of Kenya’s best looking and best skilled techies - and will be jointly hosted for the 2nd year by iHub, Nailab and m:Lab East Africa at Bishop Magua Centre, Ngong Road.
Barcamp is an unconference - participants run it. Anyone and everyone can attend. Please join us by registering here. Attendees set the agenda for what’s discussed, lead the sessions and workshops that fill the schedule, and create an environment of innovation and productive discussion.
Who should attend: the curious, the unconventional, the brilliant, the resilient, thinkers, hackers, crackers, builders, coders, techies, writers, artists, ninjas, everyone.
- Come prepared to: share ideas, challenge ideas, engage with others
- Bring: gadgets, code, designs, community attitude, friends, deodorant
- Don’t bring: wordy powerpoint presentations, hubris, suits and ties
Hashtag #BarcampNBI
The theme for Barcamp Nairobi 2014 is:
Who's Your Data's Daddy?
Is privacy and security online possible in Kenya?
We entrust our most sensitive, private, and important information to private technology companies. At the same time the increasing usage of technology has attracted the attention of authorities eager to provide caveats on the openness of the Internet and the range of freedoms, which we enjoy online.
At Barcamp Nairobi 2014 we are eager to talk about privacy and surveillance. We will explore whether there are any strategies and solutions that Kenyan citizens, corporations and governments are using to protect their privacy and security online.
Tuesday, July 8, 2014
file locking using a context manager (with statement) in python
I needed a quick locking mechanism to prevent my daemons from stepping over each other. To have a sane daemon startup (and prevent multiple daemon spawns), we need to ensure that we have an exclusive lock before starting the program. Googling around didn't turn up any context managers that actually use the flock syscall.
So here goes my attempt that seems to work:
Spinning off some python processes that utilise this context manager shows serialisation taking place:
And here's the output of lsof showing locking for the processes spun off above:
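For illustration, here is a minimal sketch of a flock-based context manager of the kind this post describes. It is not the original listing: the name file_lock, the blocking parameter and the lock path are assumptions made for the example.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path, blocking=True):
    """Hold an exclusive flock(2) lock on `path` while the with-block runs."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        # With LOCK_NB this raises OSError if someone else holds the lock
        fcntl.flock(fd, flags)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical lock file for a daemon
    lock_path = os.path.join(tempfile.gettempdir(), "mydaemon.lock")
    with file_lock(lock_path, blocking=False):
        print("got the lock, starting daemon work")
```

A second process that runs the same block with blocking=False fails fast instead of spawning a duplicate daemon; with blocking=True it waits its turn, which is the serialisation described above.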
Monday, May 19, 2014
Redistilling PDFs that are not portable by design
I hate it when I am forced to deal with documents that are portable in title only (yes, I am looking at you, Adobe). Every so often, I get PDF documents from a major organisation that can be viewed in Adobe Acrobat only. On OSX, this bloated application consumes 369 Megabytes of precious SSD space (Preview consumes 29 Megabytes and is nicer).
Anyway, back to the story, these documents cannot be saved in any other format on my machine. In fact, the only way to read these documents w/out hackery is to print them out and rescan them back.
!Stupid!
So here goes a recipe for saving these files in a portable way.
Saturday, May 17, 2014
Subnet calculation using pure mysql
You can easily aggregate your records by subnet using MySQL, thanks to bitwise operators, inet_aton (an ASCII-to-number function) and some thinking...
Here you go:
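The same idea can be sketched in Python for anyone who wants to sanity-check the arithmetic outside the database: convert the dotted quad to a 32-bit integer, AND it with the netmask, and group by the result. This is an illustration of the technique, not the original query; the function names are made up, and in MySQL the equivalent building blocks are INET_ATON(), the & operator and INET_NTOA().

```python
import socket
import struct
from collections import Counter


def ip_to_int(ip):
    """Dotted-quad string to 32-bit integer, like MySQL's INET_ATON()."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]


def int_to_ip(number):
    """32-bit integer back to a dotted quad, like MySQL's INET_NTOA()."""
    return socket.inet_ntoa(struct.pack("!I", number))


def subnet_of(ip, prefix_len):
    """Network address of `ip` for the given prefix length (0 to 32)."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return int_to_ip(ip_to_int(ip) & mask)


def count_by_subnet(ips, prefix_len=24):
    """Aggregate a list of addresses by their subnet."""
    return Counter(subnet_of(ip, prefix_len) for ip in ips)
```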
Thursday, May 15, 2014
tshark: display filters + reporting using csv
You can do pretty nifty things with tshark. The absolute life saver is tshark's ability to dump to a csv/tsv file using a user-specified display filter.
As an example, I'd like to point out some packet retransmission issues to my provider in a nice (manager friendly) spreadsheet. Here we go:
Manager friendly output:
ip.src | tcp.srcport | ip.dst | tcp.dstport | tcp.flags.syn | tcp.flags.ack | tcp.flags.push | tcp.flags.reset | tcp.analysis.bytes_in_flight | tcp.len |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 1 | 0 | 0 | 0 | 0 | |
e.f.g.h7 | 9999 | a.b.c.d | 8645 | 1 | 1 | 0 | 0 | 0 | |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 0 | 1 | 0 | 0 | 0 | |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 0 | 1 | 1 | 0 | 168 | 168 |
e.f.g.h7 | 9999 | a.b.c.d | 8645 | 0 | 1 | 0 | 0 | 0 | |
e.f.g.h7 | 9999 | a.b.c.d | 8645 | 0 | 1 | 1 | 0 | 1154 | 1154 |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 0 | 1 | 0 | 0 | 0 | |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 0 | 1 | 0 | 0 | 1448 | 1448 |
a.b.c.d | 8645 | e.f.g.h7 | 9999 | 0 | 1 | 1 | 0 | 1502 | 54 |
e.f.g.h7 | 9999 | a.b.c.d | 8645 | 0 | 1 | 0 | 0 | 0 | |
How do we get there?
1. Identify the fields that you want. A wireshark display filter cheat-sheet is a good place to start. You can home in on the fields that you want by firing up Wireshark and using the expression builder (button right next to the filter input box) then selecting the protocol that you want.
2. Choose your TCP stream.
3. Assemble your command. The one used to display the output above is:
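As a rough sketch of step 3, the snippet below assembles a tshark command line for the columns in the table above. It is not the original command: the capture file name and the stream number are hypothetical, and it assumes a tshark version that accepts -Y for display filters (older releases used -R).

```python
# Columns shown in the table above
FIELDS = (
    "ip.src", "tcp.srcport", "ip.dst", "tcp.dstport",
    "tcp.flags.syn", "tcp.flags.ack", "tcp.flags.push", "tcp.flags.reset",
    "tcp.analysis.bytes_in_flight", "tcp.len",
)


def tshark_csv_cmd(pcap_path, display_filter, fields=FIELDS):
    """Build an argv list that makes tshark print the given fields as CSV."""
    cmd = ["tshark", "-r", pcap_path, "-Y", display_filter, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    # Header row, comma separator, double-quoted values
    cmd += ["-E", "header=y", "-E", "separator=,", "-E", "quote=d"]
    return cmd


if __name__ == "__main__":
    # capture.pcap and stream 0 are placeholders for your own capture
    print(" ".join(tshark_csv_cmd("capture.pcap", "tcp.stream eq 0")))
```

Redirect the output of the printed command to a .csv file and it opens directly in a spreadsheet.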
Partitions in Postgres: Automatically creating partitions based on an attribute
A long time ago... I worked on importing ~ half a billion log records into Postgres. To achieve a low query response time, I used a partitioner that would shard records monthly. I documented it in the Postgres docs
Here it is:
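A rough Python sketch of the monthly sharding idea, for illustration only: it computes the child table name and a DDL string for one month, assuming inheritance-based partitioning with a CHECK constraint. The table name logs and the column ts are hypothetical, and this is not the partitioner referred to above.

```python
from datetime import date


def month_bounds(day):
    """Return (first day of the month, first day of the next month)."""
    start = day.replace(day=1)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start, end


def partition_name(parent, day):
    """Child table name for the month containing `day`, e.g. logs_2014_03."""
    return "%s_%04d_%02d" % (parent, day.year, day.month)


def partition_ddl(parent, column, day):
    """DDL for a monthly child table that inherits from `parent`."""
    start, end = month_bounds(day)
    return (
        "CREATE TABLE IF NOT EXISTS {child} ("
        "CHECK ({col} >= DATE '{start}' AND {col} < DATE '{end}')"
        ") INHERITS ({parent});"
    ).format(child=partition_name(parent, day), col=column,
             start=start.isoformat(), end=end.isoformat(), parent=parent)
```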
Sunday, March 16, 2014
Making sense of /proc/buddyinfo
/proc/buddyinfo gives you an idea about the free memory fragments on your Linux box. You get to view the free fragments for each available order, for the different zones of each numa node. The typical /proc/buddyinfo looks like this:
This box has a single numa node. Each numa node is an entry in the kernel linked list pgdat_list. Each node is further divided into zones. Here are some example zone types:
- DMA Zone: Lower 16 MiB of RAM used by legacy devices that cannot address anything beyond the first 16MiB of RAM.
- DMA32 Zone (only on x86_64): Some devices can't address beyond the first 4GiB of RAM. On x86, this zone would probably be covered by Normal zone
- Normal Zone: Anything above zone DMA and doesn't require kernel tricks to be addressable. Typically on x86, this is 16MiB to 896MiB. Many kernel operations require that the memory being used be from this zone
- Highmem Zone (x86 only): Anything above 896MiB.
Say we have just rebooted the machine and we have a free pool of 16MiB (DMA zone). The most sensible thing to do would be to split this memory into the largest contiguous blocks available. The maximum order is fixed at compile time at 11 (orders 0 through 10), which means that the largest block the buddy allocator hands out is 4MiB (2^10 * page_size, with 4KiB pages). So the 16MiB DMA zone would initially be split into 4 free blocks.
Here's how we'll service an allocation request for 72KiB:
- Round up the allocation request to the next power of 2 (128KiB)
- Split a 4MiB chunk into two 2MiB chunks
- Split one 2MiB chunk into two 1MiB chunks
- Continue splitting until we get a 128KiB chunk that we'll allocate.
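The steps above can be sketched as a few lines of arithmetic. This is only a model of the splitting, assuming 4KiB pages and a maximum order of 11; the function names are made up for the example.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
MAX_ORDER = 11     # orders 0..10, so the largest block is 2**10 pages


def order_for(nbytes, page_size=PAGE_SIZE):
    """Smallest order whose block (2**order pages) can hold nbytes."""
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def split_path(nbytes, start_order=MAX_ORDER - 1, page_size=PAGE_SIZE):
    """Block sizes in bytes visited while halving from start_order down to the needed order."""
    target = order_for(nbytes, page_size)
    return [(1 << order) * page_size for order in range(start_order, target - 1, -1)]


if __name__ == "__main__":
    for size in split_path(72 * 1024):
        print("%d KiB" % (size // 1024))
```

For a 72KiB request this walks 4096, 2048, 1024, 512, 256 and finally 128 KiB, leaving one free buddy behind at each order along the way.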
Here's an example of an allocation failure from a Gentoo bug report.
In such cases, the buddyinfo proc file will allow you to view the current fragmentation state of your memory.
Here's a quick python script that will make this data more digestible.
And sample output for the buddyinfo data pasted earlier on.
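A minimal sketch of such a script, for illustration rather than as the original listing. It assumes 4KiB pages and the usual line layout "Node N, zone NAME count0 count1 ..."; the sample line in the usage part is invented.

```python
def parse_buddyinfo(text, page_size=4096):
    """Parse /proc/buddyinfo text into one dict per zone.

    Column i of each line is the number of free blocks of order i,
    i.e. blocks of 2**i contiguous pages.
    """
    zones = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node":
            continue
        counts = [int(count) for count in parts[4:]]
        free_bytes = sum(n * (1 << order) * page_size
                         for order, n in enumerate(counts))
        zones.append({
            "node": int(parts[1].rstrip(",")),
            "zone": parts[3],
            "free_blocks": counts,
            "free_bytes": free_bytes,
        })
    return zones


if __name__ == "__main__":
    # Invented sample line; on a real box read open("/proc/buddyinfo").read()
    sample = "Node 0, zone      DMA      1      0      2"
    for zone in parse_buddyinfo(sample):
        print(zone["node"], zone["zone"], zone["free_bytes"] // 1024, "KiB free")
```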
Wednesday, January 29, 2014
Fifos and persistent readers
I recently worked on a daemon (call it slurper) that persistently read data from syslog via a FIFO (also known as a named pipe). On startup, slurper would work fine for a couple of hours then stop processing input from the FIFO. The relevant code in slurper is:
Digging into this mystery revealed that the syslogd server was getting EAGAIN errors on the FIFO descriptor. According to man 7 pipe:
O_NONBLOCK enabled, n <= PIPE_BUF
If there is room to write n bytes to the pipe, then write(2) succeeds immediately, writing all n bytes; otherwise write(2) fails, with errno set to EAGAIN.
The syslogd daemon was opening the pipe in O_NONBLOCK mode and getting EAGAIN errors which implied that the pipe was full. (man 7 pipe states that the pipe buffer is 64K).
Additionally, a `cat` on the FIFO drains the pipe and allows syslogd to write more content.
All these clues imply that the FIFO has no reader. But how can that be? A check on lsof shows that slurper has an open fd for the named pipe. Digging deeper, an attempt to `cat` slurper's open fd didn't return any data.
cat /proc/$(pgrep slurper)/fd/# Be careful with this. It will steal data from your pipe/file/socket on a production system
So I decided to whip up a reader that emulates slurper's behaviour
Strace this script to see which syscalls are being invoked
This reveals that a writer closing its fd will cause readers to read an EOF (and probably exit in the case of the block under the context manager).
So we have two options:
1) Ugly and kludgy: Wrap the context manager read block within an infinite loop that reopens the file:
2) Super cool trick. Open another dummy writer to the FIFO. The kernel sends an EOF when the last writer closes its fd. Since our dummy writer never closes the fd, readers will never get an EOF if the real writer closes its fd.
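The difference between the two situations can be shown in a single process. This is a small demonstration sketch, not slurper's code: it uses a throwaway FIFO in a temp directory and a non-blocking reader so nothing hangs, and the function names are made up.

```python
import os
import tempfile


def read_state(fd):
    """Return data, b"" on EOF, or None if the pipe is empty but a writer is still attached."""
    try:
        return os.read(fd, 4096)
    except BlockingIOError:
        return None


def demo(keep_dummy_writer):
    """Simulate a writer restart and report what the reader sees afterwards."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.fifo")
    os.mkfifo(path)
    reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    # The dummy writer is opened by the reader itself and never written to
    dummy = os.open(path, os.O_WRONLY) if keep_dummy_writer else None
    writer = os.open(path, os.O_WRONLY)   # stands in for syslogd
    os.write(writer, b"log line\n")
    os.close(writer)                      # "syslogd restarts"
    first, second = read_state(reader), read_state(reader)
    for fd in (reader, dummy):
        if fd is not None:
            os.close(fd)
    os.unlink(path)
    os.rmdir(tmpdir)
    return first, second


if __name__ == "__main__":
    print(demo(keep_dummy_writer=False))  # second read is b"": EOF
    print(demo(keep_dummy_writer=True))   # second read is None: no EOF
```

Without the dummy writer the second read returns an empty byte string, which is the EOF that ends a plain read loop; with it, the pipe just looks empty and the reader can keep waiting for the real writer to come back.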
The actual root cause: The syslog daemon was being restarted and this would cause it to close and reopen its fds.
Wednesday, January 22, 2014
Macbook pro setup for office use
Funky prompt thanks to powerline and powerline-fonts. Powerline can integrate with vim/ipython/bash/zsh…
- http://powerline.readthedocs.org/
- http://powerline.readthedocs.org/en/latest/fontpatching.html
- https://github.com/Lokaltog/powerline-fonts
I seem to prefer zsh over bash these days (git integration, rvm integration…):
In zshrc: ZSH_THEME="agnoster"
Plugin support
Theme screenshots.
Vim has a very cool set of plugins thanks to spf13:
If you have a mac, iterm2 rocks:
And finally, I like the solarized theme for my terminal: