Akash Trehan

SpamSlam - A blockchain solution to less spam

2018-06-06T08:20:00+00:00

Note: A post elaborating on my experience at this hackathon was published here

This project was done as part of Hack InOut, a 30-hour hackathon held in Bengaluru in Fall 2017. There was a blockchain track, and since we had never explored this trending technology before, we figured we would do something in this space. However, our idea can be implemented without blockchain as well, and perhaps in a better manner. With due gratitude to Gnosis, who sponsored the first prize that we won in the Blockchain Track and whose platform we used in the hackathon implementation, I will discuss the idea independent of any implementation.

Spam prevention is costly. According to a 2012 study by the American Economic Association, spam costs the world $50 billion and earns the spammers about $50 million. This is one of the more modest estimates. The study also attributed the problem to the ease of sending spam, and suggested that a heavier penalty would do the world good. Drawing from one of the many crazy ideas of Vitalik Buterin, the Ethereum co-founder, we wondered: what if sending emails did cost us? And then one wonders, isn’t this the same as putting stamps on letters? And perhaps one can paste a costlier stamp if one wants their message delivered faster?

Consider additionally the culture shift that came with the exponential growth of telecommunications. Calling someone used to be the way to go, and it was much quicker than sending letters and waiting a long time for a reply. But as it turns out, we prefer communication to be less instant, and we let the other person take their time to reply. Along the lines of a similar cultural oscillation, perhaps we would prefer that sending emails not be free. If we had a system of digital stamps, imagine what more we could achieve. If someone has paid for a costlier stamp, then the message will be delivered to you faster. The urgency is reflected in the cost the sender pays for it. For friends and family, it would be easy to have a setting where they don’t have to pay anything to contact you, but for everyone else, you end up with a really effective filter.
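To make the stamp idea concrete, here is a minimal sketch in Python. It is not the hackathon implementation: the class name `StampedInbox`, the `min_stamp` threshold and the rule that free senders are queued with whatever stamp they attach are all assumptions made for illustration.

```python
import heapq
from itertools import count


class StampedInbox:
    """Hypothetical inbox that delivers costlier-stamped messages first."""

    def __init__(self, free_senders, min_stamp):
        self.free_senders = set(free_senders)  # friends and family pay nothing
        self.min_stamp = min_stamp             # cheapest stamp accepted from anyone else
        self._heap = []
        self._arrival = count()                # tie-breaker: earlier arrival first

    def receive(self, sender, stamp, message):
        """Queue a message; return False if an unknown sender underpaid."""
        if sender not in self.free_senders and stamp < self.min_stamp:
            return False
        # heapq is a min-heap, so the stamp is negated to pop the costliest first.
        heapq.heappush(self._heap, (-stamp, next(self._arrival), message))
        return True

    def next_message(self):
        """Return the most urgent queued message, or None if the inbox is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

In this sketch a stranger who attaches no stamp is filtered out, while a free sender is accepted but waits behind anyone who paid more. Whether free senders should instead jump the queue is a design choice the idea leaves open.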

There is one major challenge. How do you shift our civilisation from the current model to this one? For this, we have no clear answer, and if we did, we would be off implementing it. Apart from this, we spent a long time looking for other loopholes in the plan, and we were able to shoot down every one that we came up with. As part of the hackathon prize, we also won office hours with YCombinator, who also did not see any problems with the idea beyond the one I’ve already highlighted. I would be delighted to discuss this further, and you are welcome to write to me.

You can find the full presentation here.

Thanks to Kumar Ayush and Kumar Ashutosh for working with me on this project. Also thanks to Kumar Ayush for letting me borrow this post from his blog.

HackInOut 4.0 Winners - My First Blockchain Project

2018-06-06T08:15:00+00:00

My Other Computer is Your Computer - Malware Classification

2018-05-06T00:00:00+00:00

Source: https://www.cyberpointllc.com/images/svcs/img-mare.jpg

Background

Why this project?

This project emerged from a requirement of the Machine Learning course (EE 769) I took this semester. As I have always been interested in computer security, I wanted to combine it with my newly learnt knowledge from this course. Last year’s attacks by malware like WannaCry, NotPetya and Bad Rabbit had made me curious about how such attacks could be prevented. Malware classification was the perfect project. For this I teamed up with my good friend Mukesh Pareek, who is also a security enthusiast. This post is written in collaboration with him. Our guide for this project is Prof. Amit Sethi.

Malware

Source: http://thepcworks.com/malware/

Wikipedia defines malware as:

Malware, short for malicious software, is an umbrella term used to refer to a variety of forms of hostile or intrusive software, including computer viruses, worms, Trojan horses, ransomware, spyware, adware, scareware, and other intentionally harmful programs.

The definition tells us that there are many classes of malware. These categories are made based on how the malware propagates as well as what its intent is.

The Problem

On hearing about malware classification two things come to mind:

Separating malware from benign files
Given a malware, identifying which class the malware belongs to

We focused on solving the second problem.

But why classify malware into classes? Just as a doctor can treat you better if they know what disease you have, anti-malware software such as antivirus programs can defend better if it knows the class of malware it is dealing with.

Challenges

Earlier, malware was detected using signatures. Whenever a new malware was found, the companies created its signature, and any file whose signature matched was detected. Since this method could only find exact matches, it was very restrictive. Later, people shifted to identifying “indicators”, which were the defining properties of a class of malware. If a file had certain indicators, it could be classified into the corresponding class. But this again uses only known indicators. With new obfuscation and polymorphism techniques, the creators of malware were easily able to get past this layer of protection. The problem we face today is that millions of malware samples spread every day. Most of these are duplicates or slight modifications of one another, made to deceive the defense systems. Classifying malware thus becomes a daunting task.

Why Machine Learning?

In today’s data-rich world, machine learning has become ubiquitous. With its ability to find useful pieces of information in data, machine learning has led to great results. The defense against a malware attack depends on the broader category of malware and not necessarily on the specific attack sample. This is why machine learning can be used. It can find hidden relationships among the various features of the samples and then leverage those to classify unknown samples.

More specifically…

Now we come to exactly what information we have about the malware and which categories we need to classify it into.

For every malware sample, the input we have is:

.asm file - This contains the assembly code for the malware program and can be used to extract information about instruction calls, segments etc.

Snippet from asm file

.bytes file - This contains the hexadecimal representation of the file’s binary content. It can be used to extract information about the lower level functioning of the malware.

Snippet from bytes file

So these two files need to be used to classify the malware into the following 9 families:

Ramnit
Lollipop
Kelihos_ver3
Vundo
Simda
Tracur
Kelihos_ver1
Obfuscator.ACY
Gatak

The dataset

The specific problem as stated above is taken from a malware classification challenge organised by Microsoft on Kaggle. The dataset was also taken from there. It contained 200 GB of training data and 200 GB of test data. Since we didn’t have the labels for the test data, we divided the training data itself into two parts, one of which we used for testing purposes. The testing part was half the size of the training part, with the training set having 7221 samples and the test set having 3648 samples. This division was done randomly, while ensuring that enough members of each class were part of both the training and the test set.

Each malware sample had a 20-character-long ID. We had a CSV file containing the ID to Class mapping of the training samples.
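A random split that still keeps every class on both sides can be sketched as below. This is an illustration rather than our exact script: the column names `Id` and `Class`, the fixed seed and the default one-third test fraction (which gives a test part half the size of the training part) are assumptions.

```python
import csv
import random
from collections import defaultdict


def load_labels(csv_path):
    """Read the ID -> class mapping; assumes a header row with 'Id' and 'Class'."""
    with open(csv_path, newline="") as handle:
        return {row["Id"]: row["Class"] for row in csv.DictReader(handle)}


def stratified_split(labels, test_fraction=1 / 3, seed=0):
    """Randomly split sample IDs class by class into (train_ids, test_ids)."""
    by_class = defaultdict(list)
    for sample_id, cls in labels.items():
        by_class[cls].append(sample_id)

    rng = random.Random(seed)
    train_ids, test_ids = [], []
    for cls in sorted(by_class):
        ids = sorted(by_class[cls])
        rng.shuffle(ids)
        if len(ids) < 2:
            n_test = 0  # a lone sample stays in the training part
        else:
            # At least one test sample, and at least one left over for training.
            n_test = min(len(ids) - 1, max(1, round(len(ids) * test_fraction)))
        test_ids.extend(ids[:n_test])
        train_ids.extend(ids[n_test:])
    return train_ids, test_ids
```

Splitting per class rather than over the whole list is what guarantees that rare families are not left out of either part by chance.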

Preprocessing and Feature Extraction ¹

The features we used for classification are as follows:

Instruction n-gram from .asm file - We extracted a list of instructions from the .asm file and used the count for each instruction (1-gram) and each instruction-instruction pair (2-gram).
Byte n-gram from .bytes file - We used the hexadecimal representation to extract the byte sequence of the actual malware. Then we used the 1-gram and 2-gram counts as our features.
Segment Size - We stored the number of lines in each of the segments - Header, Data, Text etc. This information is extracted from the .asm files.
Pixel Intensity of .asm files - We converted the .asm file into an image and then extracted the last 1000 pixels of the image as features.

Our intuition behind using instruction n-grams was that samples from the same class of malware should have similar code, and hence similar instruction sequences should be present in the code. n-grams were a way to represent that. The same goes for the byte n-grams. Using segment size is again based on the intuition that the amount of static data and the amount of space required for the code would be similar within a class.
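The n-gram counting described above can be sketched in a few lines of Python. The helper names are hypothetical, and the layout assumed for a .bytes line (an address followed by space-separated hex bytes, with placeholders such as `??` for unreadable bytes) is an assumption about the dataset rather than something stated in this post.

```python
import string
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens; keys are n-tuples."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def bytes_from_hex_line(line):
    """Parse one .bytes line, assumed to look like '00401000 56 8D 44 24'."""
    _address, *hex_bytes = line.split()
    # Keep only well-formed two-digit hex values; placeholders are dropped.
    return [
        token
        for token in hex_bytes
        if len(token) == 2 and all(ch in string.hexdigits for ch in token)
    ]
```

The same `ngram_counts` function serves both feature families: pass it the instruction list from the .asm file or the byte list from the .bytes file, with `n=1` or `n=2`.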

Implementation details

Extracting the above features involves text processing and parsing. For this we used the pyparsing Python library. The library can be used to specify token formats, which makes it easier to identify the required instructions or bytes. To get an image from the .asm file we used byte arrays.
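One way to read "image from byte arrays" is to treat every byte of the file as one 8-bit grayscale pixel. The sketch below follows that reading; the function names and the zero-padding of files shorter than the requested length are assumptions, not a record of our exact code.

```python
def last_pixels(data, n_pixels=1000):
    """Interpret raw bytes as grayscale pixels (0-255) and keep the last n_pixels."""
    pixels = bytearray(data)
    tail = list(pixels[-n_pixels:])
    # Left-pad short inputs with zeros so every sample has the same length.
    return [0] * (n_pixels - len(tail)) + tail


def pixel_features(path, n_pixels=1000):
    """Read a file in binary mode and return its last n_pixels pixel intensities."""
    with open(path, "rb") as handle:
        return last_pixels(handle.read(), n_pixels)
```

Because only the tail of the pixel sequence is used, the image never has to be reshaped into rows and columns for this feature.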

To speed up the feature extraction we used ProcessPoolExecutor from Python’s concurrent.futures module, which made sure that all the cores were being used for processing.
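A minimal sketch of that pattern follows. The worker `extract_features` is a stand-in that only counts whitespace-separated tokens, not our real extractor, and the `chunksize` value is an arbitrary choice.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def extract_features(path):
    """Hypothetical per-file worker: count whitespace-separated tokens in one file."""
    with open(path, errors="ignore") as handle:
        return path, Counter(handle.read().split())


def extract_all(paths, max_workers=None):
    """Run the worker over all files in parallel and return {path: features}.

    With max_workers=None the executor picks a worker count based on the
    number of available processors.
    """
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(extract_features, paths, chunksize=8))
```

The worker has to be a module-level function so that it can be pickled and sent to the worker processes. On platforms that start processes by spawning, call `extract_all` from under an `if __name__ == "__main__":` guard.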

After extracting the features we dumped them to a file so that the processing need not be done again.
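The dump-once, read-many idea can be sketched as a small cache helper. The use of `pickle` and the name `load_or_compute` are assumptions; any serialisation format would do.

```python
import os
import pickle


def load_or_compute(cache_path, compute):
    """Return features from the dump if it exists; otherwise compute and dump them."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            return pickle.load(handle)
    features = compute()
    with open(cache_path, "wb") as handle:
        pickle.dump(features, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return features
```

Here `compute` is any zero-argument callable that runs the expensive extraction, so the first call pays the full cost and later calls only read the dump. Pickle files should only be loaded from sources you trust.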


Training

We used the following models/techniques for learning:

Support Vector Classifier
Extreme Gradient Boosting (XGBoost)
Logistic Regression
K Nearest Neighbour Classifier
Random Forest
Neural Network

For each of these models we did hyperparameter tuning to find out the best model. Grid search was used to try out all combinations for the values of hyperparameters. We used k-fold cross validation with k=4 for training. To make efficient use of our CPUs we did the grid search in parallel since training of each hyperparameter combination is independent of the other. We used sklearn and xgboost libraries to help us with training.
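What grid search with k-fold cross validation does can be spelled out in plain Python. In practice sklearn's `GridSearchCV` packages this loop (and can run it in parallel through its `n_jobs` parameter); the sketch below only illustrates the logic. `score_fold` is a hypothetical caller-supplied function, and the contiguous, unshuffled folds are a simplification.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def grid_search(param_grid, n_samples, score_fold, k=4):
    """Try every hyperparameter combination; return (best_params, best_mean_score).

    score_fold(params, train_idx, val_idx) must train a model on train_idx and
    return its accuracy on val_idx.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [score_fold(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

Each pass of the outer loop depends only on its own `params`, which is why the combinations can be handed to separate processes.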

Evaluation

Hyperparameter Tuning

The graphs for hyperparameter tuning are as follows:

Support Vector Classifier

K Nearest Neighbour Classifier

Logistic Regression

Extreme Gradient Boosting (XGBoost)

Random Forest Classifier

Cross Validation and Test Set Accuracy

Model	4-Fold Cross Validation Accuracy	Test Set Accuracy
Logistic Regression	0.9745187647140285	0.910562449264865
Support Vector Classifier	0.9775654341503947	0.869346629
Neural Network	0.941	0.893875612342112
K Nearest Neighbour Classifier	0.9641323916355076	0.821231293817848
XGBoost	0.9945990859991691	0.921231623812763
Random Forest	0.9609472372247612	0.88658497372

We get very good cross-validation accuracies with all models, but XGBoost works best.

XGBoost still dominates all the other models on the test set, but logistic regression and the neural network also come quite close.

Problems faced and Learning

Understanding what did not work is as important as understanding what worked. This section talks about the challenges we faced during this project and what we learned from them. The first was the number of features. We wanted to use higher-order n-grams, but the number of combinations was too large, leading to very slow training. To get across this hurdle we decided to use Random Forest feature selection, so that the other models need not train on all the features but only on the most important ones.

We were also trying to account for loops in the .asm files while getting the instruction counts. But since we can only do static analysis of the files, we could only follow unconditional jumps which would not have been very useful.

Since we were trying out various techniques we hadn’t used before, we decided to apply semi-supervised learning. But later we learnt that it is used when we have a small amount of labelled data and a large amount of unlabelled data. Then we also use the unlabelled data for learning. Since we didn’t have any shortage of samples, we decided not to do this.

We also wanted to try out Deep Learning but due to the large size of the files (~100 MB for many of the .asm files) it would have been very slow without extracting features manually first to decrease the size.

A major problem we faced was the huge size of the data. We didn’t have enough space on our computers to store all the training data, so we had to store it on a server and run all our code there. After doing this a few times, we came up with the idea of dumping the features after extracting them the first time, so that we could read directly from the dumps afterwards. This reduced the size from 200 GB to ~1 GB! We thought we were done, but then we ran short of another resource - the RAM. The features from all the data did not fit in RAM. A better idea at this point would have been to do batch learning, but we ended up just training on a smaller amount of data due to lack of time.

We learnt a lot of practical ML lessons during the project, which increased our understanding significantly.

Conclusion and Future Work

We got good enough accuracy with the data and the low computational resources we had. Thus we can conclude that machine learning can be an effective technique for malware classification. In fact, it is being used extensively in industrial applications these days.

In spite of all the success, machine learning models aren’t foolproof either. The datasets used to train the models are usually biased because there is no common data sink for malware samples. This is caused by the lack of collaboration in the industry.

In future, we would like to try out more models and more combinations of features to find out which ones work best together. We will also make a web front-end for the application where people can upload malware samples, while in the backend our models predict their class. This would make the project a complete, ready-to-use package for users.


¹ Our code is available on GitHub here.

Graphics - Modelling, Rendering and Animation

2017-12-31T08:15:00+00:00

The Right way to use Sublime Text

2017-12-23T00:00:00+00:00

After having used Sublime Text the wrong way for about 2 years, I have finally learnt my lesson. I am going around the internet looking for ways to be more productive with Sublime Text.

Sublime Text is a Swiss Army knife with all kinds of tips and tricks up its sleeve. These include but are not limited to keyboard shortcuts, creating projects and code snippets.

I will keep listing down interesting things I find so this blog will be updated regularly for a few days. Remember to try out these tricks hands on as you read through them else they won’t get registered in your mind.

So let’s get started!

Next occurrence of a word: cmd + d (Mac) | ctrl + d (PC)
Multi-cursor: cmd + left mouse click | ctrl + left mouse click
Column selection: alt + left click drag | shift + right click drag
Split selection into lines: cmd + shift + L | ctrl + shift + L
Move cursor to beginning of line: cmd + left arrow | home
Wrap selection with HTML tag: ctrl + shift + w | alt + shift + w
Move line vertically: cmd + ctrl + arrow | ctrl + shift + arrow
Duplicate line: cmd + shift + D | ctrl + shift + D
Delete line: ctrl + shift + K | ctrl + shift + K
Indent line: cmd + [ or ] | ctrl + [ or ] (Also Edit -> Line -> Reindent for indentation of selection)
Paste & indent: cmd + shift + V | ctrl + shift + V (Very very useful!)

Again, do try all this out by yourself!

Cheers!

If you are an Infosec person, don’t forget to check out my CTF Write-ups

See other Blog posts

Hacking Postgres Internals - Indexing Schemes for Data Recording Systems

2017-12-13T08:20:00+00:00

Project Report

Team DataAcids
Harshith Goka
Akash Trehan
Abhishek Kumar
Tarun Verma

For the backstory on this project read this first.

The code for this project has been open-sourced on Github.

Introduction

Every minute, 600,000 pieces of content are shared on Facebook, and more than 100,000 tweets are sent. And that does not even begin to scratch the surface of data generation, which spans sensors, medical records, corporate databases, and more. With so much data being stored, viewed and analysed, high performance is a must. Hence, the need of the hour is to store and retrieve data efficiently without compromising performance.

The reference paper for the project can be found here.

Objectives

Design a technique that supports both insertion and queries with reasonable efficiency, and without the delays of periodic batch processing.
Implement this on top of PostgreSQL, one of the most popular open-source DBMSs.

Functionalities

Insertion of new tuples into the relation along with updating the corresponding stepped-merge index
Search using the custom index we implement

System Architecture

Front-end

We did not implement a separate front-end. The project uses the user interface that PostgreSQL provides.

Back-end

Implemented a structure similar to Log Structured Merge trees (Stepped Merge Trees) to organize the incoming data by clustering on the search key.
- Worked in single-user mode.
- Did not handle concurrency control and recovery issues.
Implemented in the C language.
Used Eclipse IDE for debugging and building the project.

Engineering details:-

Our goal was to maintain multiple indices (runs) that together make up the actual index. Only one of the runs is in memory at a particular time, and it acts as the index for the latest incoming data. After this run fills up the memory, it is written to disk using a bottom-up B-tree build. Both the in-memory run and the one just constructed are Level -1 runs.

We have implemented a stepped-merge algorithm as suggested in the paper. The algorithm has two parameters: K (the maximum number of trees at any level) and N (the number of levels). When K runs of level i accumulate on disk, we merge them to create a single run of level i+1. When a run finally reaches level N, we write it to the root relation.
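The merge schedule is easier to follow in a toy model. The Python sketch below is not our C implementation: sorted lists stand in for B-trees, keys are plain integers, nothing is written to disk, and the class and method names are made up for illustration. Levels are numbered from 0, so a run promoted past level N-1 is merged into the root.

```python
import heapq
from bisect import bisect_left


def _in_sorted(run, key):
    """Binary search for key in a sorted list."""
    i = bisect_left(run, key)
    return i < len(run) and run[i] == key


class SteppedMergeIndex:
    """Toy model of a stepped-merge index; sorted lists stand in for B-trees."""

    def __init__(self, k, n, run_capacity):
        self.k = k                            # runs per level that trigger a merge
        self.n = n                            # number of levels below the root
        self.run_capacity = run_capacity      # keys the in-memory run can hold
        self.memory_run = []
        self.levels = [[] for _ in range(n)]  # levels[i] is a list of sorted runs
        self.root = []                        # the single sorted root run

    def insert(self, key):
        self.memory_run.append(key)
        if len(self.memory_run) >= self.run_capacity:
            self._add_run(0, sorted(self.memory_run))
            self.memory_run = []

    def _add_run(self, level, run):
        if level == self.n:
            # The run has climbed past the last level: merge it into the root.
            self.root = list(heapq.merge(self.root, run))
            return
        self.levels[level].append(run)
        if len(self.levels[level]) == self.k:
            # K runs accumulated: merge them into one run for the next level.
            merged = list(heapq.merge(*self.levels[level]))
            self.levels[level] = []
            self._add_run(level + 1, merged)

    def contains(self, key):
        """Look in the root, then in every run, then in the in-memory run."""
        runs = [self.root] + [run for level in self.levels for run in level]
        return any(_in_sorted(run, key) for run in runs) or key in self.memory_run
```

With `k=3`, `n=3` and `run_capacity=4`, inserting the keys 1 to 12 leaves level 0 empty and a single 12-key run at level 1. A lookup has to consult every run because a key can live in any of them, which is the price paid for cheap inserts.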

To go about this task, we first have to make Postgres recognise that we have created an index. We need to add an entry to the pg_am system catalog to identify our 'smerge' index as an access method. This is done by adding an entry to the file pg_am.h, giving it a unique OID and the name of the handler (which is created next).

    DATA(insert OID = 9399 (  smerge        smergehandler i ));
    DESCR("stepped merge index access method");
    #define SMERGE_AM_OID 9399

To be useful, an index access method must also have one or more operator families and operator classes defined in pg_opfamily, pg_opclass, pg_amop, and pg_amproc, which allow the planner to determine what kinds of query qualifications can be used with indexes of this access method. Hence the corresponding entries are added in the corresponding files.
Next, a new access method directory is created in src/backend/access (called ‘smerge’ in our case). Inside this directory we create a file called smerge.c (the corresponding .h file is created in src/include/access) and define the handler function, which returns an IndexAmRoutine with the access method parameters and callbacks. Various parameters set in this handler describe the kind of support our index provides. For example, amroutine->amcanorder is set to false, indicating that ordering is not yet supported by the index. All the functions from nbtree.c are retained (renamed to suit our convenience), and their definitions are changed to meet our requirements. The basic idea was to reuse the functionality of nbtree as much as possible, either by calling its functions from ours or by borrowing its ideas, since we were merely building multiple versions of it.
To build a new smerge index, the smergebuild() function is used. It is tailored to construct a B-tree index creation statement and execute it, giving that index a unique OID. The built-in function DefineIndex() (defined in indexcmds.c), which creates a new index given the index creation statement and other parameters, was used for this. In this function we also needed to add the metadata corresponding to each B-tree. The metadata to be included is defined in the struct SmMetadata as follows:-

    typedef struct SmMetadata {
            int K;
            int N;

            int attnum;
            AttrNumber attrs[INDEX_MAX_KEYS];

            int levels[MAX_N];
            Oid tree[MAX_N][MAX_K];

            int currTuples;

            Oid curr;
            Oid root;

            bool unique;
    } SmMetadata;

The metadata is stored by first allocating a page of one block (size BLCKSZ) and then calling the functions _sm_init_metadata() and _sm_writepage(), which are defined in the file smmeta.c.
- _sm_init_metadata() initialises the metadata values. We have hard-coded the values of K and N here.
- _sm_writepage() uses functions similar to those used by the storage module of Postgres, specifically smgrwrite(), to store the metadata on the first page of the smerge index relation.
Since the OID of each newly created index is stored in the metadata page using smgrwrite(), we can easily get the B-tree by calling index_open() on the stored OID.
The next part is to insert an index tuple into the current B-tree.
- For this we first get the metadata of the relation and then extract the OID of the current in-memory B-tree using the _get_curr_btree() function, which simply uses index_open() to get that B-tree.
- Once we have this btreeRel, we simply call the bt_insert() function to insert the new tuple, and then close the opened index using index_close().
- Next we need to check whether the current in-memory tree is full. If it is, we create a new in-memory tree using _sm_create_curr_btree() and call sm_flush() to flush the values into the next level.
- Finally, we need to write the changed values to the metadata page, since a new tuple was added and the count of entries in the in-memory tree has changed. So we call _sm_write_metadata() again to update the metadata.
smsort.c contains the implementation for merging the indices, which involves creating spools for the various indices and then merging them. The main function called when it’s time to merge is sm_flush().
For debugging, we also need all search queries to go through the smerge index once it has been created. Hence, as a hack, we changed the smergecostestimate() function to set the costs very low (close to 0).
Once one of the levels is full and it’s time to merge the K runs, sm_flush() is invoked; it is responsible for merging the K level-i runs into a single run of level i+1. The function’s implementation is inspired by the function bt_load() in the file nbtsort.c, which merges two spools (the second one is for dead tuples).
To create the spools we need to get the tuples of each index separately. For this we do an index-only scan to get all tuples of the particular index. We create a ScanKey such that all the tuples are returned. Currently we assume that the entries are greater than a particular number (the smallest integer that fits in an int can be used for this). After the spools are created, they are passed to the tuplesort_performsort() function. Although the spools are already sorted, the sort state needs to be set up properly, which this function does. Merging of level N-1 into the root is handled separately but uses similar merging logic.

Run Through

K = 3, N = 3, max_tuple_per_index = 4

create table foo (uid int, name varchar(20)); -- Create a sample table
create index sm on foo using smerge (uid); -- Create the smerge index

insert into foo values (1, 'axzagd');
insert into foo values (2, 'axzagd');
insert into foo values (3, 'axzagd');
insert into foo values (4, 'axzagd');
-- Memory index fills up. A new index 1 is created and the filled index goes to level 0.
insert into foo values (5, 'axzagd');
insert into foo values (6, 'axzagd');
insert into foo values (7, 'axzagd');
insert into foo values (8, 'axzagd');
-- Similarly, index 2 is created.
insert into foo values (9, 'axzagd');
insert into foo values (10, 'axzagd');
insert into foo values (11, 'axzagd');
insert into foo values (12, 'axzagd');
-- Similarly, index 3 is created. Level 0 fills up. Indices 1, 2 and 3 are merged to create a level 1 index.
insert into foo values (13, 'axzagd');
insert into foo values (14, 'axzagd');
insert into foo values (15, 'axzagd');
insert into foo values (16, 'axzagd');
-- A new level 0 index is created, and so on.
insert into foo values (17, 'axzagd');
insert into foo values (18, 'axzagd');
insert into foo values (19, 'axzagd');
-- ...
-- After level N-1 fills up, it is merged with the single root relation.

Further Work

We had hard-coded the parameters N and K; they could instead be exposed as user parameters that can be changed later on.
The cost estimation functions are yet to be implemented properly.
As of now, Postgres chooses B-trees for the default indices (primary key, foreign key etc.). Changes need to be made so that smerge is chosen.
Currently, for search queries, we start our search from the root relation and move upwards, which may not necessarily produce output in sorted order (which might be desired in certain situations). In short, the ordering property is not supported; the output step could be modified to sort the tuples before returning them.
There are memory leaks (specifically relcache leaks) that are not handled in the code; these should be fixed before running performance tests against B-trees.
Update and delete operations are not yet supported. Once the order by operation is handled, these could be done efficiently. In addition, Bloom filters might be needed for performing these.

Resources

https://www.postgresql.org/docs/9.6/static/xindex.html (Prequel for the below)
https://www.postgresql.org/docs/9.6/static/indexam.html
https://www.postgresql.org/files/developer/internalpics.pdf
https://www.pgcon.org/2016/schedule/attachments/434_Index-internals-PGCon2016.pdf

*All mentions of B-tree actually refer to B+ trees

Database course project and how I almost ditched it!

2017-12-12T08:20:00+00:00

Background

This was the best semester ever! All the courses I took were Computer Systems courses (except Psychology, which is another subject I love). I had 3 labs which were great fun, and the cherry on the cake was this databases project I took up with three of my friends.

We were given the freedom to choose whatever project we liked, which more often than not is a responsibility …aagh another responsibility!

We were given some sample projects we could take up, most of which were Android apps. Their main focus was software development and understanding how to design database schemas. Most of the teams came up with great ideas for this type of project but as usual my rebellious self kicked in.

“This is the only course project you’re doing this semester! It must be something different, something awesome!”

The first step was to convince the team to take on a hard project. Since the project counted for 30% of the course marks, not being able to complete it would be devastating for our grade. The team had some discussions and by the end all of us were pretty excited to take on the challenge. We knew it was a risk but we did it anyways.

We talked with our guide, Prof. Sudarshan S and decided on the project you’re reading about. “Hacking Postgres Internals” had a nice ring to it I thought. The project actually implements a part of his paper from 1997.

The Team

Harshith Goka, Abhishek Kumar and Tarun Verma were my teammates. I have teamed up with Goka a few times before. He’s very enthusiastic about software development and has always been a great teammate. With Abhishek, I had done the Digital Logic Design project before, and we have been good friends since. I had never teamed up with Tarun before but knew he was a sincere guy. We really enjoyed doing the project together!

The Preparation

When we started, we had little idea about what we’d gotten ourselves into. We knew very little about Postgres internals, so it was a long ride to successfully adding a new feature to it.

Prof. Sudarshan provided us with a lot of helpful material on the subject and on our request even agreed to take a session explaining the basics. We attended the session, learnt new stuff, sincerely decided to start on it the next day itself and then forgot about it for a few weeks :P

When we finally got to it we had forgotten everything from the session, so we started all over again. We went through the material slowly but steadily. After finishing the reading, it was time to start implementing. We decided to start on it the next day itself. You know what happened after. We didn’t start until after our final exams :P

(The preparation material is mentioned at the end of the project report)

To be or not to be

A problem with adding a feature to an existing project is that you have to spend time understanding the existing code. It is exhausting but there’s no other way. We spent a lot of time on this during our preparation. Like a lot of time. A lot I mean. We ourselves hadn’t added much to the code. It’s not a very good feeling. So much effort but nothing concrete to show. I’ll be honest, I started having doubts if we would be able to complete the project. In fact, I discussed with the team and we decided that we would switch to a simpler project which we were sure to complete :/

We went to Prof. Sudarshan to tell him (read: ask permission from him :P) about our decision. I usually take lead in such situations and I knew it was going to be awkward (and sad). I started by telling him about our pain of not feeling a sense of progress. He was very positive and told us ways to take the project forward. He was so excited about the project and talked about it so passionately that I just wasn’t able to tell him we were planning to switch.

So no permission, no switch.

We got our heads back into Postgres, determined to complete it.

And we did end up completing the project. It was lengen…wait for it…dary. Legendary!

The Project Report

All the engineering details are mentioned in the report.

The report is in another post here.

The code for this project has been open-sourced on Github.

Do check out other projects, my blog or my write-ups for various CTFs.

CSec - Binary Exploitation 2

2017-11-08T00:00:00+00:00

The second episode of my Binary Exploitation series is out!

(The first one can be found here.)

In this one I talk about some more advanced exploitation techniques, mitigation strategies used against buffer overflow attacks and how to bypass them. There’s a lot of stuff this time. In fact, it’s about double the length of the previous video.

Don’t miss the demo at the end!

Constructive criticism is much appreciated.

Cheers!


Won Ubisoft GameJam 2017

2017-10-18T11:00:00+00:00

With high hopes and curious eyes we entered the Ubisoft Office. We were in Pune for the final round of Ubisoft’s GameJam.

Let’s rewind back a month. Ubisoft had announced a qualifier round for the GameJam. We were required to form teams and submit game ideas. The top four ideas would go on for the final round. Since I try to participate in every hackathon with my team Ferozepurwale, this was our next target. After a good amount of brainstorming we came up with about five game ideas. We voted on them and decided to submit a role-play game. Since this post exists, you already know that we made it through.

Coming back to the final round…

The theme given to us for the hackathon was Flood!

Fortunately one of the ideas we had in our mind touched upon the theme. We decided to go 3D with Unity. I had never used Unity before but my teammates had.

A good game needs a good backstory. After working on the backstory and overall idea of the game, we got started. (BTW do take a look at the official Unity3D’s tutorials - they’re great!)

We had about 30 hours to complete our game - the graphics, level design, characters and an overall immersive experience.

The philosophy of level design was one of the best things I learnt during the hackathon. How to introduce the features of your game, what upgrades to add in each level and how to increase the difficulty without making it impossible… it was all really cool.

After two days of coding, free food, hot chocolate and some great mentoring from Ubisoft, we completed our game and also ended up winning the hackathon!

Here’s an aftermovie of the hackathon by Ubisoft:

The code for our game is open source and available here.

Thank you for reading!

Cheers!


CSec - Binary Exploitation 1

2017-08-25T00:00:00+00:00

CSec is the cybersecurity club of IIT Bombay started by me a few months ago. I have two aims in mind for the club:

To spread awareness about various technical/non-technical stuff related to Computer Security
To build some strong teams for Capture the Flag competitions

Although the school year remains very busy, I try to give as much time as possible towards this end. In line with these aims, I decided to start a vodcast series on Binary Exploitation with help from the legendary Web & Coding Club.

Presenting the first video from the series (This is my debut video; go easy on me :P) -

Liked it? Didn’t like it? Let me know!

Constructive criticism is much appreciated.

Cheers!

