- AI Homework
- Evaluated Switches to order for mptcp work
- Racked Servers
- Networked Servers
- Read some modeling papers
- Setup server for health project
- Established dates for IBM internship
- Met with Don and Prashant about MPTCP work
- Converted videos for the CS677 course
Last weekend we had quite a bit of snow fall, by my estimates over 15 inches. As a result, Monday morning was a delayed opening. This was a nice way to start the week.
I spent most of the first half of this week catching up on my AI homework that was due on Tuesday the 14th. It was a problem set on heuristic search. While an interesting subject I failed to leave myself enough time to get all the way through it. I managed to complete a solid 85% of the homework. Hopefully I do better on the next assignment.
Don and Prashant Meeting
I met with Don and Prashant to discuss the status of our MPTCP work and in which direction to proceed. We discussed different directions and decided to divide my time between the modeling work for edge clouds and the spark data center experiments. We finally decided on a topology that looks like the picture above. We use simple 1g switches and then aggregate them with 2 10gig switches. The redundant paths should hopefully allow MPTCP to function at its best and improve some of our research. We agreed to try and replicate some of the experiments in the IBM HotCloud’16 paper about Spark and the [Ir]relevance of network bounds. We also briefly discussed some possible resources for the modeling work.
New Servers Saga
The saga with the new servers continues. I was able to rack the servers on Wednesday in a little under an hour which is a new record. I hooked up one server so that our group can use it for an unrelated health project until their server comes in. On Thursday, I got impatient waiting on our new switches, so I ventured back to the server room and hooked up the rest of the machines. I hooked up 8 servers directly to the main cluster network and then used two 8 port 1G switches to connect their two secondary interfaces. I also configured the servers with the MPTCP routing tables, and configurations we normally use. I went back to my old system admin toolbelt and brought out tmux-cssh to configure the machines. It’s a tool that opens up terminals on all the machines and then runs the same command on each one. I was able to configure the machines in way less time!
Towards the end of the week, I began running experiments on our spark infrastructure. I started with PageRank but a single run over the 1G livejournal dataset took over an 3 hours. Due to a lack of large shuffle phases, I decided to move on to the next workload. I decided on TeraSort and was able to successfully complete a 20G Terasort quite easily. With the test case out of the way. I moved onto a larger data set that resembled the workload in the IBM paper. As of this post, it is still running. I will provide an update on the results next week.
I settled on dates for my Internship at IBM Research. It appears that I will be at IBM from May 22nd to August 18th. Here’s to a summer in New York!
What’s up for next week?
- Run some of the experiments from the IBM paper.
- Finish putting together presentation on the modeling notes from YC
- Evaluate possible future directions for the mptcp work.