Anybody remember the Riscure embedded hardware
CTF from a long time ago, RHme2? I have a whole playlist covering various challenges.
Two of them cover reverse engineering a binary that runs on the Arduino board,
which means their architecture is AVR. And if you are not very familiar with reversing
embedded devices or AVR I would recommend watching these videos before continuing here.
This video will be about another reversing challenge, but explained by Zeta Two. I didn’t
solve that challenge, but he was really excited about sharing this write-up with us. This video
was planned for a long time and because I still don’t have a good process for working
with other people I screwed up the first time we recorded this. So thank you for sitting
down with me a second time. So I’m Carl, known on the internet as
Zeta Two, and I play a lot of CTFs. But before we fully head into the technical
parts I want to highlight something. I know from the time when all of this stuff was new
to me, that reverse engineering and reading all of this assembler code and figuring out
what it does, seemed so impossible. How can anybody ever know and learn all of that. The
truth is, we don’t know all of it. It’s a mix of knowledge, experience, gut feeling,
research, guessing and assumptions. Then over time all of these things get better,
more refined and you are able to understand more and more complex systems. And so when
you listen to Carl I want you to pay attention to his language. Even when he is fairly certain
about something, he uses language like “it looks like”, “it could be”, “maybe”,
“if we assume”. And that makes sense because we are exploring an unknown target we have
to start somewhere and piece together the puzzle. Early on you can make guesses like
“this is blue, could be the sky in the top of the picture”, but maybe later you realize
it was actually part of a blue car. And so I think a key takeaway here is how important
it is to always make assumptions, try to picture the whole thing, but try to verify it somehow
and always be ready to accept that an assumption was wrong. And that is probably the key to
success. Anyway. Let’s head in.
FridgeJit, reverse engineering 400 points. A senior technical manager of a fridge manufacturer
demanded the ability to update the firmware. It turned out that the CPU that comes with
the fridges does not allow self-upgrading the firmware, so the developers built a VM
for the fridge software. A crafty customer has been able to reverse
engineer the software and programmed the fridge with different software. His goal was to build
a digital safe. Are you able to crack the password? We have
been able to extract the full firmware image of a slightly different fridge and a memory
dump of their fridge. We hope this is enough. And we get the challenge binary, which you
could flash onto the board, which is basically the same as the firmware.bin and the memory.dmp.
As you can see RHme2 had this neat map for the challenges. And here is FridgeJit.
This was one in a series of challenges about reversing a virtual machine on the AVR.
You can see that the path connects to more challenges. But let’s look at the first
one. So Carl will now retell how he reverse engineered
and solved this challenge. And obviously it all starts with IDA.
The first start is to load this into IDA. So I’m using my old IDA db, so I have already
named stuff but we’ll go through how you can come up with these names and realize what
is what – like the reversing process. So IDA is a reverse engineering tool that
helps you puzzle together the pieces. It helps you by visualizing code flow and some other
stuff and you are able to add names for functions and add comments. So these function names
were not there at the start. Carl named those functions based on what they
do. And so he will now walk us through some important functions and how you can figure
that out. But some other names, for example this memory location here called
io_UCSR0A, show how IDA can help you. One thing that's very convenient, and that probably
is the absolute first thing you should do when reversing some kind of embedded thing.
And I did this at some point but took me a while, is to get the memory mapping of the
device. So that you can name memory areas properly. So if you look at some datasheet
of the processor there will be a list of what memory regions are mapped to what function,
different registers have special meanings. What he describes here you can see in my AVR
reversing video in a bit more detail. But essentially small CPUs like this atmega have
to interact with other hardware. For example maybe it supports serial. And you know serial
is this high/low/high/low protocol at a certain speed and you don’t have to program that
yourself in assembler. The chip has this as a feature and all you have to do is to write
in assembler which symbol you want to transmit. And a certain memory address is mapped to
this hardware feature. So when you write a value to that address in assembler code, some
part of the chip, in the hardware, will receive that value and then perform the proper low/high
serial communication. That’s what mapped memory means. Very simple concept but very
powerful. And additionally the memory in general, like the ROM or RAM are also mapped to certain
memory ranges. And you want to know where is what in order to understand what is referenced
by addresses in the assembler code. And you can find all of this information in the manual,
the documentation of the chip used. And in IDA there is support for having configuration
files where you can define the processor and you can tell it all the memory areas and what
their names are. For the atmega328, which is the one used in the CTF, there isn’t
one in IDA, but I actually found a forum post where somebody had created one like this and
I just copied it into my IDA AVR config file. So this is a definition file for the atmega328
where you have names for all the different registers.
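To make the idea of a register definition file concrete, here is a minimal Python sketch of the lookup that such a config enables. The addresses are the USART0 data-memory addresses listed in the ATmega328P datasheet. The helper names `name_address` and `annotate` are made up for this illustration, and IDA's own names carry a prefix, as in io_UCSR0A.

```python
# Subset of the ATmega328P register map (data-memory addresses from the datasheet).
ATMEGA328P_REGS = {
    0xC0: "UCSR0A",  # USART0 control and status register A
    0xC1: "UCSR0B",  # USART0 control and status register B
    0xC2: "UCSR0C",  # USART0 control and status register C
    0xC4: "UBRR0L",  # baud rate register, low byte
    0xC5: "UBRR0H",  # baud rate register, high byte
    0xC6: "UDR0",    # USART0 data register (read = receive, write = send)
}


def name_address(addr):
    """Return a symbolic name for a data-memory address, or a hex fallback."""
    return ATMEGA328P_REGS.get(addr, "unk_%04X" % addr)


def annotate(accesses):
    """Turn (mnemonic, address) pairs into readable strings (hypothetical helper)."""
    return ["%s %s" % (op, name_address(addr)) for op, addr in accesses]


# Two raw memory accesses as they might appear before any names are applied.
for line in annotate([("sts", 0xC6), ("lds", 0xC0)]):
    print(line)
```

With names like these in place, searching for every use of UDR0 is what leads to the serial send and receive routines.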
And the point of this is, when you load the binary into IDA you will have things named
from the start. If we go here you will have some memory areas named in IDA, and this is
very nice, because then you can look for interesting registers and look where they are used in
the code and you can start working your way backwards from there.
For example one very basic thing that we want to find out, is like how and where in the
code does it communicate. Like send and receive data on the serial port.
So if you look in the manual you will see that it uses this register to communicate
on the serial port. If you look at where this is used you will find a function. And this
means that this is a function responsible for sending data on the serial port.
Wait, how do you know that it’s sending data here exactly?
It basically tells, “I’m going to send stuff” and then it loads data in this register
and then it gets sent away on the serial. So basically we are looking for a place where
something is writing to this register. And also, conversely it receives data by reading
this register. So if we look where this register is used we find exactly two places. So one
place it writes data into the register, and one place it reads data from this register.
So basically we look at this function. It first checks the status of this register.
And basically what it’s doing here it’s checking that it’s clear to send data. That
it’s not busy. It just loops here until it’s ready to send. And then it sends one
byte. And the other way around, if we get the other
function, we have here. It loops until it’s ready to receive and then it receives one
byte. And returns that. So it felt very natural to name it io_receive and io_send.
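The two polling loops described here can be modelled in a few lines of Python. This is only a sketch under assumptions. The flag positions (UDRE0 as bit 5 and RXC0 as bit 7 of UCSR0A) come from the ATmega328P datasheet, but the transcript does not say which bits the firmware actually tests. The `FakeUart` class is a made-up stand-in for the hardware.

```python
UDRE0 = 1 << 5  # "data register empty" flag in UCSR0A: clear to send
RXC0 = 1 << 7   # "receive complete" flag in UCSR0A: a byte is waiting


class FakeUart:
    """Toy stand-in for the USART hardware (hypothetical, no real timing)."""

    def __init__(self, incoming=b""):
        self.rx = list(bytearray(incoming))
        self.tx = []
        self.busy_polls = 2  # pretend the transmitter is busy for two polls

    def read_status(self):
        status = 0
        if self.busy_polls > 0:
            self.busy_polls -= 1
        else:
            status |= UDRE0
        if self.rx:
            status |= RXC0
        return status

    def write_data(self, byte):
        self.tx.append(byte)
        self.busy_polls = 2

    def read_data(self):
        return self.rx.pop(0)


def io_send(uart, byte):
    # Loop until the status register says the transmitter is free, then send one byte.
    while not (uart.read_status() & UDRE0):
        pass
    uart.write_data(byte)


def io_receive(uart):
    # Loop until the status register says a byte has arrived, then return it.
    while not (uart.read_status() & RXC0):
        pass
    return uart.read_data()


uart = FakeUart(incoming=b"A")
io_send(uart, 0x42)
print(uart.tx, io_receive(uart))
```

Both functions have the same shape: a busy-wait on a status flag followed by one access to the data register. That shape is what makes them recognisable in the disassembly.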
And we can then see where this is used. And it’s used in a number of places. For example
this function. Which I have named print_text. So it is basically calling the io_send function
in a loop. So what this function does is basically it
takes a pointer to some data and then it loops over this data and sends one byte at a time.
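As a rough Python model of that loop: the function below walks memory from a pointer and hands each byte to a send routine. The stop condition is an assumption. The transcript does not say how the firmware knows where the text ends, so a zero terminator is used here purely for illustration. The `rom` contents are invented apart from the "FridgeJit Console" string mentioned in the video.

```python
def print_text(memory, pointer, io_send):
    """Send bytes starting at `pointer` until a zero byte (assumed terminator)."""
    while memory[pointer] != 0:
        io_send(memory[pointer])
        pointer += 1


# Invented ROM image: two padding bytes, a string, a terminator, then other data.
rom = bytearray(b"\xff\xffFridgeJit Console\x00more")
sent = []
print_text(rom, 2, sent.append)  # sent.append plays the role of io_send
print(bytes(bytearray(sent)).decode("ascii"))
```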
And in the same fashion we have the receiving end. There are functions using this to receive
data. So that’s nice and we will check deeper on that track later I think. But for now we
can just go back to like where the program starts. This is also a thing that you get
from loading the config. You get the interrupt table, or what do you call it, so and the
reset one is basically the entry point for the processor. So this is where everything
starts running. Where the program starts. And this is basically like a standard thing
where it copies some static data in some location and clears a memory region.
So up until here, carl sounds like he knows exactly what he is doing. Like if he reads
this stuff every day. And yes, this is clearly coming from experience how programs and chips
work and being able to read the documentation, the manual of the chip and making sense of
it. But of course he doesn’t know everything. It’s a puzzle, other parts of the picture
take more effort and time until he realises it.
In the beginning I thought this is important, but then I realized this is just a standard
set up thing. It’s like the start function before the main function in a regular x86
program. So he had an assumption at the beginning which
much later turned out to be wrong. I mean I think I moved on but then I’ve
revisited it later. And I realized that what it’s doing, It’s copying static data from
the ROM into the RAM. So I think here maybe. Yeah. Basically the RAM is empty when you
start and then the program loads basically global variables and constants and things
like that. So it’s a loop copying data from the ROM into the RAM at a certain location.
And then there is this part which just zeroes out basically the rest of the RAM.
So this is just setting up the whole ram region. And then you call a function. Which eventually
ends up. So this is more like the main function where interesting things start to happen.
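The startup routine described above does two things before main runs, and they can be sketched like this in Python. All offsets and sizes are invented for the example, and the assumption that the zeroed region sits directly after the copied data is mine, not something stated in the video.

```python
def run_startup(rom, ram, data_src, data_dst, data_len, bss_len):
    """Model of the reset code: copy initial data to RAM, then zero what follows."""
    # 1. Copy initialised globals and constants from ROM into RAM.
    for i in range(data_len):
        ram[data_dst + i] = rom[data_src + i]
    # 2. Zero the region right after the copied data (assumed layout).
    zero_start = data_dst + data_len
    for i in range(bss_len):
        ram[zero_start + i] = 0


rom = bytearray(range(16))      # invented ROM contents 0x00..0x0F
ram = bytearray([0xAA] * 32)    # RAM starts out with junk
run_startup(rom, ram, data_src=4, data_dst=8, data_len=4, bss_len=6)
print(list(ram[8:18]))
```

Seeing a copy loop followed by a clear loop right at the reset vector is a strong hint that the code is compiler-generated setup and not challenge logic.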
So now that we know where the main function starts, Carl goes into the section here which
uses the very obvious print_text function. It’s not far fetched to understand that
the loading instructions before the call, load the data for the printing.
I mean if you look down here a little bit and look at these addresses. So in the ROM
we have some string constants embedded. Basically there are these strings related to loading
a program into the machine. And there is something here that looks like a debug menu. And something
here that looks like a table of registers. So and here is FridgeJit Console. And so it
looks like all of this is related to like debugging this thing.
Remember that the challenge description said that the developers built a VM.
So this is some kind of Virtual Machine built on top of AVR so we could then guess that
there is some debugging feature in the virtual machine. Here you also have this interesting
thing with AVR, that it’s basically an 8bit processor, but it has 16bit addressing, so
when you load an address you load it one byte at a time. So those are the two halves
of an address. Which is pointing to this string. Before these were named, this region already
looks interesting. Because it has this repeating pattern. Like load two bytes, call a function,
load two bytes, call a function. You can guess that this is actually doing something that
is interesting, it’s also the same function. Maybe it’s like setting up some data. Or
encryption or decryption. I mean I didn’t know before. So first of all it loads into
r24 and r25 registers. And you have this other peculiar thing with the AVR that you have
these regular numbered registers and you have these meta registers X, Y and Z, which are 16bit
registers that each consist of a pair of registers. So basically if you look at this code you
can see that it takes this address stored in the r24, r25 register, it loops over it
and calls the io_send function. So you can name this function print_text. And then it’s
very natural to see what this part is doing. It’s printing out a menu.
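The "two halves of an address" idea is plain byte arithmetic, sketched here in Python. The ordering is an assumption taken from the usual avr-gcc calling convention, where r24 carries the low byte and r25 the high byte of the first 16-bit argument. The transcript itself does not spell this out, and the example address is invented.

```python
def pair_to_address(low, high):
    """Combine two 8-bit register values into one 16-bit address."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def address_to_pair(addr):
    """Split a 16-bit address into (low byte, high byte) for two 8-bit loads."""
    return addr & 0xFF, (addr >> 8) & 0xFF


# Invented example: ldi r24, 0x34 / ldi r25, 0x12 would pass the address 0x1234.
r24, r25 = 0x34, 0x12
print(hex(pair_to_address(r24, r25)))
```

The X, Y and Z pointer registers work the same way, each being a view of two adjacent 8-bit registers (r27:r26, r29:r28 and r31:r30).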
So it says that we should provide a boot rom and then it prints out this prompt thing.
So a natural guess is that the next thing that it’s going to do is, take input from
the user. right? I think this is a good point to cut this episode.
We have used the atmega manual and an IDA config to load the binary, identify important
registers that are used for communication and traced the usage backwards, to functions
that clearly read or print data with it. And then we also went the other way around and
started at the real START of the processor and found the main function that is using
the print_text function from earlier. So, see you next time.
if you're in slack I'm going to go ahead and just post the presentation in your slack in the poster is citizen what's useful about the link is an each slide I do kind of give my reference material from a web perspective so you can actually just go right behind me and recreate the experience for yourself also I link all source code that I use with slide notes so a guy anything you see here today if it feels too rushed you can kind of come back to it at your own pace began in your slack banane in their citizen feel free to just ping me if you have any questions after this is over so I'm going to be balancing the screens quite a bit more material the guy in the presentation is going to be focused on portable executables and civically how they look within Windows memory the theme is it's going to be again a kind of a slow walk and hopefully we're picking up too fast runs or it's being able to deal with portal executables again if you enjoyed today's content I'd love to come back out and talk about how to deal with portable executables in memory from the program from a programming perspective anything you see today it's really it's my private work this is no way related to my employer this doesn't reflect what it is we do in our practice again and it's he's strictly mean talking to you guys so Who I am I thought I'd be a clever boy and do it like I was typing in a command line athletics guys you're familiar with that again my name is Peter star finger at federal boxing 13 or so years in the industry I managed to achieve a couple things academically from professional certification to degrees one time I did the math is as far as trying to figure out how many letters of the alphabet I acquired there's something like 86% of the letters so one day I'll get them all from a work perspective I currently serve as a cyber defense forensic analyst I need for my functional unit I also serve as president of the localize to say a chapter if you guys haven't heard of us it's a again a 
professional organization that's focused on promoting both contemporary knowledge on what's happening from an information security perspective as well as developing the knowledge and skill skills of its membership at this point I believe there's a $30 fee if you wanna join as a student and what you get with that $30 is that each month we kind of put on two major things one of them is our general session where we bring in a local person to speak about info second topics and of course we give you a lot of other enrichment like what's happening that month from InfoSec perspective a should probably be concerned what and then we also have a separate session that we call tradecraft and that's where we actually take you through a themed learning and maybe it's at the phone hacking meets exploit development there's always a different theme and then we take you into a learning environment more likely at CTF where again you'll get a chance to apply some of those skills moving on some other works that I've done I've had a chance to participate in some industry research there's a paper out there at the toasted nest it's a malware risk mitigation before it's actually for many years ago so maybe not so relevant today but as you go down the list there's some other work that I've done recently which again if you get an opportunity I'll share some other links at the end where you can go back and use some of that material because again maybe I'll spark an interest in you as far as considerations in your future professional choices as far as today's presentation again I'm going to go through the layout of the portal executable then I'm going to talk about what it looks like from a chronal memory perspective to have a process loaded into memory then I'm going to also go through a bit of a mapping exercise where I show how a file actually looks within memory again in a very friendly way it'll be pictures that sort of thing and then from there we'll move in some some kind of gettin easy 
stuff, Python basics, and then we'll dig into how, using Python in combination with other tools, you can go through the process of automating your ability to analyze a portable executable. So let's go and get into this.
You can see I affectionately made this. Yeah, that's right, I art; that's why I have a day job. So, speaking to the art we're looking at here: if you're ever on your computer and you notice files with a .exe extension, that is your portable executable. A portable executable conforms to the COFF format and has something called a DOS stub. Effectively what this means is that if we're reading the file from the very beginning to the end, there are specific byte values within the file that give it its distinct characteristics, allowing the operating system to execute it correctly. From a DOS stub perspective, if you look at any .exe file you'll find that the very beginning of the file has the letters MZ. That part of the file is what we call the DOS stub. Later in the file there's something called the COFF stub, COFF being the Common Object File Format, and this has at the very beginning of it the letters PE followed by a byte with the hex value zero and then another zero. This section has pointers to another portion of the file that eventually follows, called the section table, and the section table points to all the pieces and parts that are typically needed in order to load that file into memory. So it's actually fun.
After this there are typically resources. Resources are things like this: when you're on your computer and you look at the desktop, you'll notice that files resolve to a particular icon. That's all because of resources that are packed within the file. And then after the resources there's typically something called an overlay. This is the one piece that's not loaded into memory, it's very unique to each file, and sometimes there are really interesting things there, sometimes not. Now this is called an
executive object, and the specific variety of executive object that I'm representing on the screen is called the EPROCESS object. What's in the EPROCESS object is a lot of information that is important to the kernel, because from a kernel perspective, whenever a portable executable file gets loaded into memory, there are a lot of statistics the kernel needs to track: other files in memory that potentially depend on this one, and also things like which files or other memory objects have been created by the executable in memory and may need to be closed before the executable is shut down.
Let me make sure I get the direction right here. Speaking from my right to the left: typically, when you look in kernel memory, the object follows a pool header. What's important about the pool header is that if we were to take just a physical snippet of memory, the pool header also contains a unique signature. It's like when I was speaking about the DOS stub, where you have the letters MZ at the very beginning of the file. In kernel memory, to get to the data structure that contains all the important information about the process, you would look for, as an example on Windows 7, the letters PROC, and that typically represents the beginning of a pool header for an EPROCESS object structure. There is other important information, such as what kind of object structure it is. So this is mandatory, these are optional, and typically the pool header will also lead you to an object header, which will have all the other information you need to decide whether we are looking at a process. The reason I share this with you is that a lot of times you can look for the four letters PROC and potentially have collisions with other things that really aren't pool headers. So from a memory analysis perspective it's important to understand that there are multiple data structures that you
probably want to validate exist in succession in order to know that you've actually found a legitimate structure within memory building a map a little more again the process data structure contains other useful information that includes things like when the process was created where is the sections of memory that are devoted to that process where can you find control information like what cpu registers are assigned to the block of memory we're all the stacks keeps a guidin if this is your first go hearing about things that are important from a computing perspective when code gets loaded into memory just know that within the process structure a lot of those things are being managed now moving to the right began I've annotated there's a section of memory that's devoted to kernel space this is typically where this e process data will exist and again include things like you know what CPU registers and that kind of minutia is needed in order for the executable work memory there's also a limit of shared information called the process environment block now give other critical pieces of information to the code that's actually running in user space in memory and that's that bit of shared space is called a process environment block again the other piece of that amount below this line again everything below this line represents what's available to user space includes things like again the very highest addresses you'll have the actual file itself loaded will have libraries that the file is dependent on and then kind of as we go lower we'll find all those segments that we mentioned earlier and I have another graph that kind of has arrows pointing and connecting these things but you'll find all the different segments been mentioned earlier that contain the pieces parts that actually makes the program function then as we get away from down the different segments from the file we'll see all the dynamic things start to come about like keep memory and stack memory and those kinds of 
things that allows a portable executable to function on a computer I'm going to hit this and I just want to show another representation of what's in the process structure just because we're going to dig hard out what horrible executable looks like but again today we really won't be doing coke Rama CLE with the memory again this is kind of what the mapping process looks like you know I mentioned earlier these arrows are looked like but I mentioned earlier that again there's this space that's kind of for the kernel exclusively then there's again the user space where the actual content of the file will get loaded you can see here really what I'm trying to show is that all those sections that contain code and variables and things like that again those are loaded into memory into different segments as they're defined within the file you'll also have again it's not represented here but that this original file is also loaded into memory which can be discovered which would include again your boss stuff your cost of again the section table just taking a look all the heap and staff segments that's actually typically derived from information that can be fine found within these stubs as far as what sizes are needed presentation quickly okay again this has been shared in the channel but what I want to show is one of the links is for nerve soft and this shows a lot of these kernel structures I think it's just a great resource because it's freely available on this page if you search for each process you get a closer look to see what's within this this particular kernel structure that represents a process in memory and again as I mentioned earlier there's a lot of statistical type things that are being tracked by the kernel but for us is the analyst what's interesting again I hope to get to this in a future presentation are things like when did the process first get into memory when did it exit memory one of the things that's interesting is there's for instance something called a 
doubly linked list, and all the processes that are in user space are linked up in kernel memory, in this process structure, through a series of pointers. They make a long chain. When a process comes out of memory, it's typically unlinked from the doubly linked list. However, earlier I mentioned this idea of a pool header, and if you know which pool header you're looking for, you can find processes in memory that are in an exited state and potentially recover all the volatile artifacts that were associated with that particular running process. So it's pretty useful. You can see here the reference to the process links, which is the doubly linked list.
I mentioned earlier that there's also the VAD. From a kernel-space perspective, the operating system is tracking which physical parts of memory are assigned to a process. But from a process perspective, when code is loaded into memory it typically expects that its address space is continuous and not all over the place. Unfortunately, and one day I should probably talk about this, kind of like on a disk, the physical RAM that's available can be in multiple different physical locations from an addressing perspective. So the VAD, or virtual address descriptor, is the structure that's used to take all these disparate physical locations and bring them back together into one continuous address range. So instead of going 0, 5, 6, let's say those were the addresses, you would get 0 through 3, as an example. Here's an example of that: here's maybe how physical memory looks in reality. From a process perspective, the process thinks that everything it has access to is back to back and regular, just like it is inside the file, right? Everything's back to back. But this is really the mess that the kernel is managing, which is, you know,
the space that's allocated could be all over.
So I'm going to dive into some Python basics to help warm us up. The code that I've shared for this was written in Python 2. If you try it in Python 3, you're going to spend a lot of effort correcting things. I hope to do a Python 3 version, but for your purposes, if you want to try out the code that's there, it's Python 2.
From a Python perspective, help is a really valuable command. Here is what help reveals to us: I'm just opening an interpreter, and this is Linux, and inside the interpreter we can ask for help on different function calls. What you get is something that looks like a man page, and it tells you exactly how to use a particular function call or a particular module that you've imported. Another quick example: there's a module called sys. If you use the import command to bring in the sys library, this library allows you to access arguments. Right here it's telling you that if you want to access arguments somebody is passing to your command-line program, they are in an array, where position zero references the Python script itself and anything that follows occupies the other positions in the array. The point is that if you use help you can find really informative material that the people who have contributed to Python have put together.
Another basic that's useful is that you can also use the type command. What type brings to the table is this: let's say we're coding and we're getting a lot of error messages about how we're trying to handle a particular variable, and you're not sure what you can do with that variable or what it is. Well, you can use the type command to discover the exact data type that you're using. As a very quick example, if I were to say test is equal to three, now we all know that
that's an integer when we use a numeric value like that. However, let's say later in the program we're trying to figure out what we can do with test, and we need just a quick reminder. type will tell you what kind of primitive you're dealing with, and you can pair that with things like help to find out what other actions you can take on that particular primitive. So do take advantage of that. That was type that I just showed; you can use it against any variable that you've used within your Python code, and it's especially helpful when you're inside the interpreter and you're trying to figure things out live.
I've already mentioned sys. I also did help on open, and I bring up open because what's interesting about it is that open is a function call you can use without importing any other libraries. It allows you to interact with files on the file system, and you can specify things like how you read in the file. As an example, in the code that I've shared I ask that the file be read in as binary. When you look at the code you'll see in the function call that I name the file, then it's a comma and 'rb', which means read it in as a binary stream.
Also, close is really important. The reason close ends up being really important in all your code is that typically, once you've loaded a file into memory, you probably don't need that file anymore unless you plan on writing to it or something else. For our purposes we're talking about interpreting files, so just to keep things clean, close it if you're not using it. Free up that space and close those file handles that you've created.
Also, exit is really key. We want to have our code end gracefully, and the exit command is really useful for that. You can see in the bulleted items that it's exit without any arguments. So when your code is done, use exit to indicate that, and you'll see that in the
examples I showed you an integer I'm just kind of again show the ease of Python I'm going to show these three other data types and so from a string perspective if we put anything in quotations again that represents a string it all to do this Python twos with Fraga me one forgiving again this is just me showing that you know without specifying that I'm using a screen depending on what I set a variable to Python will automatically determine what kind of a variable what the type should be of the variable that you're setting again just like we showed earlier with integer you can also go in and ask for help on the type screen as an example and we can quickly see that with string here that there's a lot of function flaws that we can use to manipulate a string like here you know we have capitalized we can again count and see how many characters are within the string if we need to decode a string or encode it again those are all possibilities the point is is that if you as somebody who's first trying out Python don't know anything as far as how to work with data take advantage of its height and help so you know where it is you can go with again that particular primitive that you're dealing with just to show a list now it's cool about a list is that if you look at what I just did I put an integer and two strings inside of the list and so unlike more like if you were programming in C++ or C like those languages are really rigid what's nice about Python is that we kind of mix and match what it is we're putting inside arrays you're not going to get punished for that if you will in anyways again if you're leveraging a list again there's there's many things you can do with the list that are very comparable to what I showed you at a glance using a string in any ways you know and just here some of the examples you know we can reverse our list we can pop things from our list you know we can count how many things are inside of the list you know we think in terms of the list you know 
it functions a lot like an array. Encode, you wouldn't do that; you'd probably just call print to look at the inside of your list. You can do other interesting things. As an example, I could say give me everything up to the second position, or I could start from a negative index: position zero is the first item on the list, and negative one is the last. I don't know if you're seeing the syntax that I'm using there, but by using a colon and positive or negative integers we can refer to all the different positions within the list and ask for specific outputs from it. [inaudible] Mine is a tiny list, so that's not so interesting. Dictionaries: just a quick example there. This example may be a little more complicated than it needs to be, but the point is that we have a type, dictionary. What that means is that I can do lookups: I could say, hey, show me what the key "machine" returns, and whatever follows the colon is the value that would be returned if I were to query for "machine" in this particular example. Again, if we just use help, you'll see all the different ways that you can interact with a dictionary. Then, just from a comfort perspective, if you don't work with bits and bytes every day, I think it really is worth putting some energy into understanding them. The simplest way I can think to explain this is that a byte is typically eight bits, and one byte is typically written as a two-digit hex value. As an example, if we look at the very beginning of the file, we can see MZ, which marks a portable executable, and we can see that, from a hex perspective, 4D makes the M and 5A makes the Z. To get this value from a bit perspective,
you would have to have the right bits flipped in order to return a value of four. From a binary perspective, for each hex digit you would typically have four placeholders for a 1 or a 0. Starting from the right and moving to the left, the very first position, if it's turned on to a 1, has a value of 1; the next position would be a 2, the next position would be a 4, and the final position would be an 8. If all of those were ones, you would add all those values up, getting a whole-number value of 15, or the letter F. From a hex perspective, the way that it translates is that we start at 0, and once you get to 9, that's still a number in hex; once we exceed nine we go to the letter A for 10, then 11 is B, 12 is C, and, I'll lose myself here, but 13 is D, 14 is E, and 15 is F. I'll show an example function here shortly. So let me dig real quick into the example. Again, this is available to you; I'm going to show this very quickly so we can move on. In the links there's a reference to a Wikipedia page, and on that page there's a complete breakdown of what a PE file looks like from a bits-and-bytes perspective. What I want you to see here is that at the beginning you see the 5A 4D; that's the MZ I was referencing, and that's the start of the DOS header. Then, within the DOS header, position 3C is typically where we will find a pointer to where the PE header is for the file, which is the COFF section that I mentioned earlier. As for the way that we represent that from a code perspective in the example I'm going to show you: I ended up defining a function called main. In Python, what makes up a function is that you start with def, a space, whatever you want to call your function, and parentheses, and then, if there are any variables that you want to accept in your function call, you would put them between the
parentheses, comma separated. Because of Python's simplicity, you don't have to define what kind of primitive type it is by saying, oh, this is an integer, this is a string. It works very similarly to what we did earlier: whatever you set the value of that variable to is the data type that the variable assumes. In this particular example I end up creating a variable that winds up being an integer, the number 0, and then I make a function call to something called make byte array. I'm expecting two values to be returned from make byte array, so I have two variables ready to receive whatever's returned by this function call. The only thing we're doing here is reading in a file. We have the same layout that I talked about earlier: we use def to define a function, this one's called make byte array, and we're not expecting any arguments. Then anything that follows the colon, and that syntax is very important, is going to be part of the function. What's really important in Python is the way we do our indentations. In this particular example it just happens that four white spaces make up my indentation. A lot of IDEs will take care of that for you. If you don't have an IDE that you like to work in, there's a free one, PyCharm, that's pretty user friendly, and it'll help you with all these indentations so you don't have to remember things like, did I use four spaces or five spaces or three spaces in my code. That being said, let's move on to the function. For this particular function, I'm printing a message back to the screen that says building byte array, and then I go right into the business of defining a variable for the file that I'm going to open. The way that I handled this one in particular: earlier I mentioned importing sys if we want to deal with arguments, and so in the beginning of the code I have an
import statement for sys, and that's the only library that I import for this particular code. Getting back to the function call, I look at sys and I say, hey, give me argument one. Now, I'm not a programmer. Typically, if you work in the security industry, you write code that functionally does what you need it to do for the job. You're not writing code for thousands of people to use; you're writing code to get the thing done. We've got to dig a ditch, so let's write code that does that very quickly for us so we can move on to the next thing we need to accomplish. So in this scenario I look at whatever that first argument was that was passed to my Python script, I turn it into a string, and then I use a variable, my file, to capture whatever that value is. I then take my file, pass it to the open function, and say read in bytes. Just like I mentioned earlier, using the interpreter can be very helpful when you're unsure of what your options are. Typically, whether I'm in Windows or Linux, I'll have multiple windows open so I can get right into an interpreter if I need to validate live whether what I'm doing will actually work. I can take code that I'm working on in, in this example, vi, put it in there, and just see if Python yells at me. With the open function call, if you weren't sure what to do with it, we can go right into help. This is Python 3 here, where open is a bit different than in Python 2, but in this example I'm reading in bytes, and I use f to capture the fact that I've now created a reference to a file. To read in those bytes, I then create a list called bytes read, and I tell Python to go ahead and try to read in one byte at a time until all bytes in the file have been read. Now, in Python 3 there's a little more handling that you need to do when you're reading in a file, so this will not work as written, but for
Python 2 purposes, saying continue executing this while we're not getting a null result is sufficient to break the loop. Just speaking a little bit to the indentation that's going on here: we have try, and then everything that we're going to try has to be indented. I create another variable called byte to capture what I'm reading from the file, one byte at a time, and I say while that byte still has a value, colon, and then we have another indentation: continue to take those bytes and append them to my list, bytes read, which I defined up here. In order to add things to a list you have to use an append function call, and if you were to use help against a list that you've defined, you can see these little subtleties of how you work with that particular data type. The presentation gods are being very kind to me today, which means they're not. All right, moving along. We'll continue reading until we reach the end of the file, and effectively what this function is doing is populating a list. Now we can return to the main function itself. Going back to the function called main, once we have that result we're able to pass it to another function called check DOS stub. [inaudible] We're going to try to see if the DOS stub is MZ. So if we go up to that particular function, there are some variables that I just set to some default value so they actually become something usable by me. Using the double quotes with no value makes header become a string as far as its type. I create something called loc, for location, and set it to the value zero, so now I'm making it an integer. Then I also use DOS header check, and I set it equal to the brackets, which makes it a list from a Python perspective. I then say, hey, while location is less than four, and then I add another
condition within the loop, saying that while location is less than or equal to two we want to add to our DOS header check whatever bytes are at that particular location, and then I increment the location so we can get the next byte within the loop. When our location hits a value of two, we then want to do the following for loop, and it says for every entry. Now, what's interesting about a for loop within Python is that whatever follows the for can have any name you want; the point of it is just to capture whatever follows in. So if what follows in is a list, what we're going to do is go iteratively through the list until we reach the end of it, and each item in the list is assigned in turn to, in this case, the variable called entry. We then say header is equal to whatever the current value of header is; remember, earlier we set it to an empty value, and we know it's a string, so I'm just appending to the string each of the values that are inside of the list called DOS header check. Finally, what I do is encode header as hex, make sure it's uppercase, and compare it to this 5A4D, because, again, you can go to the public reference and see that this is the value. The way that I'm handling it, I'm reading it in as little-endian, so I'm reversing the values, but the point is that this is what we expect at the beginning of the file, and that is what I'm looking for. When that turns out to be true, I return a positive statement to the screen stating that, hey, this part of it is true, and I exit the function called check DOS stub. Now, there's other code in here, which you're welcome to look at, where I go through reading some of the other stubs, looking at the values and checking whether those values are correct, therefore proving to myself that this is a bona fide portable executable. The only other thing I want to point out from a coding perspective, and again you can look at this on your own, is that, because I wrap everything up as a
function, the way that I kick the whole script off, if you will, is by calling main at the very end. That's just a small subtlety, and it's useful to you because you can modularize your code into functions. So when it comes to troubleshooting, if you need to turn off a particular function call, you can quickly do that by going to your main function and saying, hey, you know what, let's comment this one out and make sure I can execute up to this point; little adjustments like that, so you can troubleshoot whatever issue you might be experiencing. Just to show this very quickly in action, this is what it ends up looking like. The point of an example like this is that people will abuse what you take for granted: a structure that's supposed to be used in a certain way may not actually turn out to be used that way. So understanding the structure of portable executables allows you to better interpret whether this is a legitimate program or potentially bad code. Everything I've just shown is just me playing around with code, and I recommend you do the same thing to broaden your horizon as far as using the Python language, because it is used quite a bit throughout the industry. And even though I've shown you all this, there is kind of a shortcut: you can use something called pefile, which is a module that you can import that exposes all these data structures to you to use however you deem necessary. Just to show that, if you go to the next slide in the slide deck, it's a small Python module, and you'll see me going through some quick examples while that loads up. What I did was I went out to the Windows system32 directory and just grabbed cmd.exe, because you really don't need to deal with malcode if you just want to be in the business of understanding how portable executables work, and so that's
what I'll use for this particular example. This is PyCharm that I'm loading up. I'm not going to go through the process of building a project, but I do want to show that once you have your project built you have to add the pefile module to your list of libraries. The way that you do that is, after your project's built, go to File, Settings, and then in Settings go to Project Interpreter and click the little plus sign. Click the plus sign and search for pefile. There are apparently a lot of other versions of pefile; the one you want is the one made by Ero Carrera. You would just click install package. For our purposes, we went through kind of a lot of work explaining, hey, here's this chart you can look at, you can see the exact breakdown, reading in bytes and all that business. That's great, but with pefile we can skip a lot of that. In PyCharm we can go right to the Python Console; that's an interpreter that's part of the PyCharm install. From there I just do import pefile, and pefile is imported. We can use help to figure out how we use this module. Because I've already been through this, it's easy for me to cut to the chase, but if you took the time to do the same reading I did, it maybe takes 10 to 15 minutes because of how well a lot of these modules are documented. For pefile there's a particular class called PE. The way that we would call a class from a module that we just imported is we would say pefile, dot, whatever class we're interested in. With this particular class, and again help shows you this, we can take a variable called pe and set it equal to pefile.PE, which references this class, and then we would just point it at whatever file we're interested in interpreting, which is what I'll do here. One of the nuances of using this within Windows is that you have to escape all your backslashes. Again, as
I mentioned earlier, I just copied cmd.exe to the desktop, so there it is. Now, effectively, cmd.exe is encapsulated within this class as an object that we can take actions on. If we weren't sure what those actions were, help will tell us; I'm going to save you from scrolling through all of it, but take advantage of help. Also, in the coding example I have lots of little things you can try. Earlier, in the example I was showing, we were reading in MZ; now you have a way to just quickly see what's in that header. At a glance, [inaudible] this is probably a list, but we have the magic value, 0x5A4D, and that matches 100% what we saw in the image showing us what we should expect at the beginning of the file. [inaudible] You can read in all the other headers without having to code for those. If we really didn't have much time, but you knew that you were going to deal with every part of the file, you can just do a full dump of all its contents and everything will be printed out for you. Just using the command line, we're able to run type against any of the particular substructures that are part of pe, and we can run help against them too. The other thing that's kind of nice, if you're using PyCharm instead of vim as an example, is that you can also see the variables you create and all the structures that are naturally part of the object you've instantiated. What I was trying to show just a second ago is that if I take my variable, which I know is an object, and add a dot, the IDE automatically shows me what all my options might be as far as function calls or structure types that I may want to reference. If that's not your desired experience, you can go right over to the right and see every variable you've
created while you were coding within the interpreter, and dynamically look to see what all the values and options are with that particular variable. Again, moving forward, I'm going to show you just a couple more tools. A tool that kind of takes this and weaponizes it, as an example, is something called pescanner. If we were to look at the code, and we've been here for a while, so look at it tonight, it's just importing pefile and using all of its built-in functions in order to make a higher-level determination: are you dealing with malcode? For those of you who have read the Malware Analyst's Cookbook, this is pescanner from that cookbook. What it's giving back to us, and this is everything you can get out of pefile, is all the header info about the file itself. We also get the entry point, as in the actual address that is the starting point for all the code; sometimes this can reveal really interesting things to you. We can also see, from a signature perspective, whether we had any signature hits; apparently one of the scanners thinks there's something funny about the file. We can also see all the individual sections. Something that's defined within all these headers is what the size in memory should be versus what the size is while it's still a file. An easy indicator for malcode is that sometimes you'll have a physical size of zero but then a really large size within virtual memory, and if that's ever happening you're probably dealing with a packed file. Another interesting thing that's being revealed here, for example, is something called entropy. [inaudible] The closer you get to a value of seven, the more packed, if you will, the code is, so that's sometimes an indication that maybe you're dealing with encrypted or compressed data. The point is that,
again, all of this is exposed to you through the pefile module. As someone in the security profession, it's worth understanding that you're able to read these things about the file using a Python coding environment to better help you determine what's good, what's bad, and what you should spend your energy on. If you can imagine, you could actually assess things en masse. I didn't get to HxD, so just to flash that on the screen; please look at this in your own time. HxD is a hex editor, and it's available in the download links for you. It's just a hex viewer, but what's useful about it is this: from a command-line perspective, you saw I was using a tool called xxd to view things in hex; HxD is a visual hex tool. If you're in the business of coding and you're trying to read things in as a binary stream, it helps to be able to see that in a hex editor and ask, am I reading this incorrectly? Is that actually the value that I should be getting at this point in the coding process? Moving forward, some other industry tools are worth taking a look at, and I provide links to all of these in the Slack channel. PEStudio kind of does a whole lot of what pescanner does; it's more of a graphical tool for Windows, and it parses out all the different sections of a PE file that you might be interested in. Resource Hacker: there's typically a part of a portable executable for all the resources, which may contain an icon or may contain other executables, and Resource Hacker gives you a chance to dig at those and see what those attached resources might be. Also Process Hacker: Process Hacker allows me to see memory in a way that Task Manager doesn't expose to you. Finally, we talked a lot about structures in memory, so I linked the Volatility Framework, which has a very nicely written wiki. What's great about the Volatility Framework is that if you're able to get a copy of memory, you can then take that memory, look at these kernel structures,
and understand all these little tidbits of information that I was sharing. And finally, WinDbg and KD: if you want to do kernel-level debugging while you're looking at something live, these are great tools for that kind of work. Also, WinDbg by its nature downloads symbols for your operating system, so you can independently spend your time looking at all the different components of the different data structures that are within memory. All right guys, I appreciate your time. I'll try to drop some links so you can find other useful information to do this on your own. If you have any questions, definitely reach out to me. This is my first go at trying to present on this particular topic, so I appreciate you guys listening through it, and if you're able to privately message me feedback, I'd appreciate it. [Applause]
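The DOS header check walked through in the talk can be sketched in a few lines. This is not the speaker's script (that one targets Python 2 and reads the file one byte at a time); it is a minimal Python 3 sketch that assumes the whole file fits in memory and uses the standard struct module. The names check_dos_header and make_fake_pe are made up for illustration, and the synthetic buffer stands in for a real file such as cmd.exe.

```python
import struct

DOS_MAGIC = b"MZ"            # bytes 4D 5A at offset 0
E_LFANEW_OFFSET = 0x3C       # 4-byte little-endian pointer to the PE header
PE_SIGNATURE = b"PE\x00\x00"


def check_dos_header(data):
    """Return the PE header offset if data looks like a PE file, else None."""
    if len(data) < E_LFANEW_OFFSET + 4 or data[:2] != DOS_MAGIC:
        return None
    (pe_offset,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    if data[pe_offset:pe_offset + 4] != PE_SIGNATURE:
        return None
    return pe_offset


def make_fake_pe(pe_offset=0x80):
    """Hypothetical helper: a synthetic buffer with just enough structure to test."""
    buf = bytearray(pe_offset + 4)
    buf[0:2] = DOS_MAGIC
    struct.pack_into("<I", buf, E_LFANEW_OFFSET, pe_offset)
    buf[pe_offset:pe_offset + 4] = PE_SIGNATURE
    return bytes(buf)


if __name__ == "__main__":
    sample = make_fake_pe()
    # Read as a little-endian 16-bit number, the bytes 4D 5A become 0x5A4D.
    (magic,) = struct.unpack_from("<H", sample, 0)
    print(format(magic, "04X"))
    print(check_dos_header(sample))
```

To try it on a real file, something like check_dos_header(open(path, "rb").read()) should work, where path is whatever executable you copied. With pefile installed, the same two fields are available as DOS_HEADER.e_magic and DOS_HEADER.e_lfanew on a pefile.PE object.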
Welcome to Malware Analysis For Hedgehogs. Today we look into PE resources: how they are located in a PE file by a parser, for instance, and how the resources and the meta information about the resources are structured. We start with the PE file itself. If you haven't watched the previous video about the basic PE structure, please watch that before this video. I will put a link in the description below, so check that out first, because I assume you know what I covered there. The starting point to find the actual resources is the optional header. The optional header has a so-called data directory. The data directory is simply a list of entries which point to certain data structures; the entries hold the virtual addresses and the sizes of these data structures, and that's also where the resource table is located. So the parser will parse the data directory entry for the resource table, and then it knows where to find the resources themselves. In our example we have two sections, and I will mark the resources green. I got used to green meaning resources because that's the default color for resources in PortexAnalyzer's visualization, so that's just the way I associate it. Our resource entry, in our case, will point to section one. It points to the start of the data structure, and the data structure for the resource information is a tree. In our example we will have a tree with two resources; every leaf of the tree is basically one resource, and the path to it contains the meta information that is nice to know. We will do a close-up of the resource tree itself soon. Also, we now know that section one contains the resource tree, so this section is the so-called resource section in our PE example. There's a convention for section names: if it's a resource section, it's usually .rsrc or .rdata. But you can always violate conventions; these names are for humans,
so Windows usually doesn't care. OK, now the close-up of our resource tree. I actually tried to draw a tree. As you know, trees in computer science grow from the top to the bottom, so the root of the tree is this, up in the air, and then we have our basic structure here. Now, on Windows there is the convention that every tree has three levels, and there's a meaning to every level. Level one, at the root, is the type of the resource; level two is the name of the resource; and level three is the language of the resource. If you have a text resource, you might have different versions of it depending on the language. As for the type, there's a fixed number of types, and it will say whether it's an icon, an image, version information, or something else. So, the name directory. OK, almost done. The name directory has a name identifier or a name pointer. If it's a pointer, it points to the address of a Unicode string. The string can be anywhere in the file, and the parser needs to know how long that string is, so it will start with the length of the string, and then there will be the actual Unicode string, which is the name of the resource. The language directory has a language identifier; every ID stands for a certain language. There are also some tables out there where you can look them up, but usually, if you have a parser, it will interpret this for you, so most of the time you don't need the tables. More importantly, the language directory has a data entry pointer, and the pointer points to a small data structure, the so-called data entry, which determines the size and the location of the actual raw data for the resource. Let's quickly complete this for the other resource as well. Now, the actual raw data, that's the green one here. The data entry says how large it is and where it starts in the file. It can be anywhere in the file and is indicated by
ones and zeros, so that's the raw data right here. It depends what it is: if it's an image, it's an image; if it's text, it's some text. It can be anything; it could also be another portable executable file, so there might be another executable in there. The same goes for the other resource, so we have our two resources here. As I said, the type directory has one entry for every type that exists among all the resources. In our case we have two entries, so there are two different types, one for each of the two resources. Let's fill this out by example. On the right side we have an icon, so we say the resource type is RT_ICON; again, there are some tables with the IDs and the corresponding types. Let's say that's a hedgehog icon: we will say our name is Hedgehog, which has 8 characters, and here's the actual icon. So that's it already. I hope you understand now how this works. See you next time, and thanks for watching.
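The three-level tree from the video (type, then name, then language, ending in a data entry) can be modelled as nested dictionaries. The sketch below is only an in-memory illustration, not a parser for the on-disk format. The class names, the second resource (RT_VERSION), the offsets, and the sizes are all assumptions made up for the example; 1033 is the language ID for US English.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntry:
    """Leaf of the tree: where the raw resource bytes start and how many there are."""
    offset: int
    size: int


@dataclass
class Directory:
    """Inner node: maps an ID or a name to a child Directory or a DataEntry."""
    entries: dict = field(default_factory=dict)


def build_example_tree():
    """Two resources, each reached through type -> name -> language."""
    icon = Directory({"HEDGEHOG": Directory({1033: DataEntry(0x2000, 0x128)})})
    version = Directory({1: Directory({1033: DataEntry(0x2200, 0x90)})})
    return Directory({"RT_ICON": icon, "RT_VERSION": version})


def walk(node, path=()):
    """Yield (path, data_entry) for every leaf; the path carries the meta information."""
    if isinstance(node, DataEntry):
        yield path, node
        return
    for key, child in node.entries.items():
        yield from walk(child, path + (key,))


if __name__ == "__main__":
    for (res_type, name, language), entry in walk(build_example_tree()):
        print(res_type, name, language, hex(entry.offset), entry.size)
```

Each leaf is one resource, and the path that walk collects on the way down is exactly the meta information described above: the type, the name or ID, and the language.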