04 - Media Files and Digital Forensics
Class: CYBR-405
Notes:
Module Objectives
By the end of this module, you should be able to:
- Identify different types of media files
- Summarize data compression and obfuscation
- Define data-hiding techniques
- Explain how to locate and recover media files
- Explain digital evidence validation and discrimination techniques
- Describe an examination plan
Images
Recognizing a Graphics File
Graphics files contain digital photographs, line art, three-dimensional images, text data converted to images, and scanned replicas of printed pictures
- Bitmap images: collection of dots
- Vector graphics: based on mathematical instructions
- Metafile graphics: combination of bitmap and vector
Bitmap vs Vector
- Bitmap: a fixed grid of colored dots (pixels)
- Vector: just a mathematical instruction that says "make me a circle and fill it with a color gradient"; it looks smooth the whole time
[Image: Pasted image 20260202093518.png]
- Bitmap loses its integrity when you zoom in
Understanding Metafile Graphics
Metafile graphics combine raster and vector graphics
Example
Scanned photo (bitmap) with text or arrows (vector)
Share advantages and disadvantages of both types
When enlarged, bitmap part loses quality
Standard graphics file formats
Common file formats
- Portable Network Graphics (.png)
- Graphics Interchange Format (.gif)
- Joint Photographic Experts Group (.jpeg, .jpg)
- Tagged Image File Format (.tiff, .tif)
- Windows Bitmap (.bmp)
- Raw file format (.raw)
- Referred to as a digital negative
- Raw format maintains the best picture quality
Understanding Graphics File Formats
Nonstandard graphics file formats
- Targa (.tga)
- Raster Transfer Language (.rtl)
- Adobe Photoshop (.psd) and Illustrator (.ai)
- Freehand (.fh11)
- Scalable Vector Graphics (.svg)
- Paintbrush (.pcx)
Audio and Video File Formats
- Audio and video files come in many different formats
- The most common current audio and video file formats include:
- Apple QuickTime Movie (.mov)
- Audio Video Interleave (.avi)
- Moving Picture Experts Group (MPEG)
Identifying Unknown File Formats
- Knowing the purpose of each format and how it stores data is part of the investigation process
- The Internet is the best source
- Search engines
- Find explanations and viewers
- Popular Web sites
- FileFormat.info
- Extension Informer
- The Graphics File Formats Page
Viewing and Examining Media Files
- In addition to Windows Photos, other media viewing programs include the following:
- FastPictureViewer Pro
- FastStone
- IrfanView graphic viewer
- VLC Media Player
- When working with Windows OSs, the digital forensics examiner may find additional evidence by examining the content of thumbnail cache (Thumbs.db) files
- On digital cameras and smartphones, photo and video files are typically stored in a Digital Camera Image (DCIM) folder
- DCIM is part of the Design rule for Camera File system (DCF)
- The DCF folder and file structure recommendations are as follows:
- Subfolder names have three numbers followed by five letters (for example, 100APPLE)
- Graphic file names have four characters, commonly three letters and an underscore, followed by four numbers and the file type extension (for example, IMG_0001.JPG)
Notes:
- You might find digital evidence by examining the contents of the thumbnail
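The DCF naming recommendations above can be checked with a couple of regular expressions. This is only a sketch: the exact character sets (uppercase letters, digits, underscore) and the three-character extension are assumptions made for illustration, and the function names are hypothetical.

```python
import re

# Assumed patterns: folder = three digits + five characters,
# file = four characters + four digits + a three-character extension.
DCF_FOLDER = re.compile(r"^\d{3}[A-Z0-9_]{5}$")
DCF_FILE = re.compile(r"^[A-Z0-9_]{4}\d{4}\.[A-Z0-9]{3}$")


def is_dcf_folder(name: str) -> bool:
    """True if a subfolder name under DCIM fits the assumed DCF pattern."""
    return DCF_FOLDER.match(name.upper()) is not None


def is_dcf_file(name: str) -> bool:
    """True if a file name fits the assumed DCF pattern."""
    return DCF_FILE.match(name.upper()) is not None


for candidate in ("100APPLE", "IMG_0042.JPG", "holiday.jpg"):
    print(candidate, is_dcf_folder(candidate), is_dcf_file(candidate))
```

Names that do not fit the pattern inside a DCIM folder are not proof of anything, but they can be a hint that a file was renamed or copied in from somewhere else.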
Examining the Exchangeable Image File Format
- Exif format collects metadata
- Investigators can learn more about the type of digital device and the environment in which photos were taken
- Viewing Exif metadata requires special programs
- Exif Reader, ExifTool, IrfanView, or Magnet Forensics AXIOM
- An Exif file stores its metadata at the beginning of the file
Notes:
- Metadata = data about data (about the file itself)
- Stored at the beginning of the file
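Since the metadata sits at the beginning of the file, a short script can look for it without any special viewer. The sketch below walks the segments at the start of a JPEG (each one is a marker, then a 2-byte big-endian length) and reports where the Exif block starts. The function name is hypothetical and the sample bytes are made up for the demo; a real tool would go on to parse the tags inside the block.

```python
import struct


def find_exif_segment(data: bytes):
    """Return the offset of the Exif (APP1) segment in JPEG bytes, or None."""
    if data[:2] != b"\xff\xd8":            # a JPEG starts with the marker FFD8
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:              # not sitting on a marker, give up
            return None
        marker = data[pos + 1]
        if marker == 0xDA:                 # start of scan: picture data follows
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return pos
        pos += 2 + length                  # skip the marker plus the segment
    return None


# Made-up sample: start marker, one Exif segment with dummy content, end marker.
sample = (b"\xff\xd8" + b"\xff\xe1" + struct.pack(">H", 12)
          + b"Exif\x00\x00" + b"TEST" + b"\xff\xd9")
print(find_exif_segment(sample))
```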
EXIF (Exchangeable Image File Format)
- Also Known As (AKA)
- Properties
- Metadata
[Image: Pasted image 20260202094312.png]
- Camera Manufacturer
- Camera Model
- Date/Time (photograph was taken)
- Exposure Time
- ISO Speed
- GPS Information (when available)
- and more.
Notes:
- We can add fields to the EXIF data and store that information as well
EXIF Information
[Image: Pasted image 20260202094423.png]
Image Properties
[Image: Pasted image 20260202094531.png]
Notes:
- There are web tools that show the location on a map from the latitude and longitude stored in the metadata
- Pic2Map
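GPS values in the metadata are usually stored as degrees, minutes, and seconds plus a reference letter (N/S/E/W), while mapping tools normally want decimal degrees. A minimal sketch of that conversion, with a hypothetical function name and made-up coordinates:

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    if ref.upper() in ("S", "W"):          # south and west are negative
        decimal = -decimal
    return decimal


# Made-up example values, not taken from any real photo.
latitude = dms_to_decimal(40, 30, 0, "N")
longitude = dms_to_decimal(73, 15, 0, "W")
print(latitude, longitude)
```

The two numbers can then be pasted into any map site as "latitude, longitude".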
Understanding Data Compression
- Data compression is the process of coding data from a larger form to a smaller form
- Graphics files and most compression tools use one of two data compression schemes: lossless or lossy
- Lossless compression techniques reduce file size without removing data
- Based on Huffman or Lempel-Ziv-Welch coding, which use codes to represent redundant bits of data
- Utilities: WinZip, PKZip, StuffIt, 7-Zip, and FreeZip
- Lossy compression compresses data by permanently discarding bits of information
- Vector quantization (VQ) is a form of lossy compression that uses complex algorithms to determine what data to discard based on vectors in the graphics file
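- Example: JPEG image compression (Lzip, which is sometimes listed as a lossy utility, is actually a lossless LZMA-based compressor)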
- Lossless compression produces an exact replica of the original data after it has been uncompressed
- Lossy compression typically produces an altered replica of the data
Notes:
- With lossy compression, the discarded data cannot be recovered, so the original image detail may be lost forever
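The difference between the two schemes is easy to see on a few bytes. In this sketch the lossless side uses Python's zlib module (DEFLATE, which combines LZ77 with Huffman coding, so it is a stand-in for the Huffman/LZW tools named above). The lossy side is a toy that simply drops the low four bits of every byte; it is not vector quantization or JPEG, just an illustration of permanently discarding information.

```python
import zlib

# Made-up "pixel" data with plenty of repetition.
original = bytes([10, 10, 10, 10, 200, 201, 202, 203] * 64)

# Lossless: compress, then uncompress, and get an exact replica back.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), len(packed), restored == original)


def quantize(data: bytes) -> bytes:
    """Toy lossy step: keep the upper 4 bits of each byte, discard the rest."""
    return bytes(b & 0xF0 for b in data)


# Lossy: the discarded bits are gone, so only an altered replica remains.
lossy = quantize(original)
print(lossy == original)
```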
Steganography
What is Steganography
Steganós + Graphia
(covered or concealed) + (writing)
Notes:
- Osama bin Laden was reportedly sending secret messages hidden in images using steganography
Steganography in Graphics Files
- Two major forms of steganography are insertion and substitution
- Insertion places data from the secret file into the host file
- Hidden data is not displayed when viewing the host file in its associated program
- You need to analyze the data structure carefully
Inspect Images
[Image: Pasted image 20260202095041.png]
Notes:
- Does Gmail show you images rendered in an email?
- Those images can include hidden tracking pixels
- With these, the sender can tell whether someone opened the email
- Substitution replaces bits of the host file with other bits of data
- Least Significant Bit
[Image: Pasted image 20260202095358.png]
- If we change just the last bit or two of each value and store a secret message in those bits, the difference probably won't be noticeable
- The more bits you change, the more noticeable it becomes
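Least significant bit substitution can be sketched on a plain run of bytes standing in for pixel values. The function names and the host data are hypothetical, and real steg tools work on actual image formats and usually add encryption or scatter the bits around; this only shows the bit swapping itself.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytes:
    """Hide message bits in the least significant bit of each host byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("host data is too small for the message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit     # clear the last bit, then set it
    return bytes(out)


def extract_lsb(pixels: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for b in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


host = bytes(range(256))                   # stand-in for pixel values
stego = embed_lsb(host, b"hi")
print(extract_lsb(stego, 2))
print(max(abs(a - b) for a, b in zip(host, stego)))
```

Each host byte changes by at most 1, which is why the altered picture looks the same to the eye.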
Can you tell the difference?
[Image: Pasted image 20260202095621.png]
- Use steganalysis tools (also called "steg tools") to detect, decode, and record hidden data
- A steg tool can also detect variations of the graphic image
- When done correctly you cannot detect hidden data in most cases unless you compare the altered file with the original file
- Check to see whether the file size, image quality, or file extensions have changed
- Clues to look for include the following:
- Duplicate files with different hash values
- Steganography programs installed on the suspect's drive
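The "duplicate files with different hash values" clue can be checked in a few lines: hash both copies, and if the hashes differ, list where the bytes differ. The helper names and sample bytes below are hypothetical; on a real case you would read the two files from a working copy of the evidence.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_offsets(a: bytes, b: bytes) -> list:
    """Offsets where two equally long byte strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Made-up stand-ins for two files that look identical in a viewer.
copy_one = b"\x10\x20\x30\x40"
copy_two = b"\x10\x21\x30\x41"

print(sha256_hex(copy_one) == sha256_hex(copy_two))
print(differing_offsets(copy_one, copy_two))
```

Differences that only ever touch the last bit of a byte are a hint of least significant bit substitution.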
YouTube channel for CTF challenges: Marty Carlos
Notes:
- Where can we have files that the OS can't see?
- Slack space on the hard drive
- It behaves much like free space
- In graphics files
- Some people call steganalysis tools "steg tools"
- The file should work just fine, the user can't know the difference
- You should remove the steg tools after using them
Identify Media File Fragments
- Recovering any type of file fragments is called carving, also known as salvaging outside North America
- Many digital forensics programs can carve from file slack and free space
- Carving helps identify image file fragments and put them together
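A very simplified idea of carving: scan a blob of raw bytes (free space, slack) for the JPEG start bytes FF D8 FF and the end bytes FF D9, and cut out everything in between. This sketch assumes each picture is stored in one contiguous piece; real carving tools also deal with fragmented files, embedded thumbnails, and false hits. The function name and the sample data are made up.

```python
def carve_jpegs(blob: bytes) -> list:
    """Return byte strings that start with FFD8FF and end with FFD9."""
    carved = []
    start = blob.find(b"\xff\xd8\xff")
    while start != -1:
        end = blob.find(b"\xff\xd9", start + 3)
        if end == -1:                      # no end marker: stop carving
            break
        carved.append(blob[start:end + 2])
        start = blob.find(b"\xff\xd8\xff", end + 2)
    return carved


# Made-up "unallocated space" with two tiny fake JPEGs inside it.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"A" * 10 + b"\xff\xd9"
blob = b"\x00" * 20 + fake_jpeg + b"junk" + fake_jpeg + b"\x00" * 5
print(len(carve_jpegs(blob)))
```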
Repairing Damaged Headers
- When examining recovered fragments from files in slack or free space, you might find data that appears to be a header
- If header data is partially overwritten, you must reconstruct the header to make it readable
- Compare the hexadecimal values of known media file formats with the pattern of the file header you found
- Each graphics file format has its own header value
- Example: A JPEG file has the hexadecimal header value FFD8, followed at offset 0x06 by the label JFIF (standard JPEG) or Exif
Notes:
- A JPEG starts with two particular bytes (FF D8) that identify it as a JPEG file
- You can see this just from reading the file
- There are tools to do this
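Comparing a header against known values can be done by hand in a hex editor or with a small lookup table. The sketch below covers only a handful of widely documented signatures; the table and function name are illustrative, not a complete list.

```python
# A few well-known header values (first bytes of the file).
SIGNATURES = [
    (b"\xff\xd8", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF"),
    (b"MM\x00*", "TIFF"),
]


def identify(header: bytes) -> str:
    """Guess the file format from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            if name == "JPEG":
                label = header[6:10]       # label stored at offset 0x06
                if label in (b"JFIF", b"Exif"):
                    return "JPEG (" + label.decode("ascii") + ")"
            return name
    return "unknown"


print(identify(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"))
print(identify(b"\x89PNG\r\n\x1a\n"))
```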
Rebuilding File Headers
Before attempting to edit a recovered graphics file
- Try to open the file with an image viewer first
- If the image isn't displayed, you have to inspect and correct the header values manually
- Steps
- Recover more pieces of file if needed
- Examine file header
- Compare with a good header sample
- Manually insert correct hexadecimal values
- Test corrected file
Notes:
- Simply use tools for this!
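The "manually insert correct hexadecimal values" step can be pictured like this: take a known-good header sample and write it over the damaged first bytes. The sketch assumes the damage is limited to the first 11 bytes and that the file was a standard JFIF JPEG with a 16-byte APP0 segment; both are assumptions, and the names are hypothetical. Always do this on a copy, never on the original evidence.

```python
# Known-good sample: FFD8 (start), FFE0 (APP0), length 0x0010, "JFIF", 0x00.
GOOD_JFIF_HEADER = bytes.fromhex("ffd8ffe000104a46494600")


def looks_like_jpeg(data: bytes) -> bool:
    """Check the FFD8 start bytes and the label at offset 0x06."""
    return data[:2] == b"\xff\xd8" and data[6:10] in (b"JFIF", b"Exif")


def rebuild_jpeg_header(damaged: bytes, good: bytes = GOOD_JFIF_HEADER) -> bytes:
    """Return a copy with the first bytes replaced by the good header sample."""
    return good + damaged[len(good):]


# Made-up recovered fragment whose first four bytes were overwritten.
damaged = b"zzzz\x00\x10JFIF\x00" + b"payload"
fixed = rebuild_jpeg_header(damaged)
print(looks_like_jpeg(damaged), looks_like_jpeg(fixed))
```

After the fix, the "test corrected file" step is simply opening the copy in an image viewer.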
Understanding Copyright Issues with Graphics
- Steganography has been used to protect copyrighted material
- By inserting digital watermarks into a file
- Digital investigators need to be aware of copyright laws
- Copyright laws for the Internet are not clear
- There is no international copyright law
- Check the U.S. Copyright Office
- U.S. Copyright Office identifies what can and can't be covered under copyright law in U.S.
- Fair use
- Another guideline to consider
Cases
Twitter catches the w0rm(er)
The FBI busted "CabinCr3w" hacker and Anonymous hacktivist Higinio O. Ochoa III, aka "w0rmer," after he posted a photograph of his bikini-clad girlfriend holding handwritten taunts to the FBI. But it's what the photograph wasn't revealing that led to Ochoa's takedown, his guilty plea on hacking charges in June 2012, and his subsequent 27-month sentence in federal prison. Namely, the photograph had been snapped with an iPhone, and the feature to automatically add EXIF information, including GPS coordinates, to photographs hadn't been disabled. Furthermore, the EXIF data hadn't been expunged before being posted to Ochoa's "Anonw0rmer" Twitter account for the world to see.
Anti-Virus doesn't stop everything
Eccentric antivirus founder John McAfee, who was fleeing his home in Belize, where he was wanted for questioning in a murder investigation, had his location in Guatemala inadvertently revealed when Vice reporters traveling with him posted a picture of McAfee that included GPS-coordinate-revealing EXIF data. In short order, Guatemalan authorities arrested McAfee, who was ultimately returned to the United States.
Notes:
- Have you heard about the McAfee antivirus?
- Is he a good or bad guy?
- He killed himself
- He last lived in Spain
- The FBI was looking for him for a long time
- He was wanted for committing fraud
Selfie solves a murder
On March 24, 2015, 18-year-old friends Cheyenne Antoine and Brittney Gargol from Saskatchewan, Canada, posted a selfie on Facebook. They headed out for the night, but later, Gargol was found dead on the side of the road, with a belt nearby. While Antoine initially stated that Gargol left with a man she met that night, police determined that her story was untrue after looking at surveillance video. The selfie posted the night of Gargol's murder came into play when authorities noticed that Antoine was wearing a belt similar to the one found at the scene. Eventually, Antoine admitted to killing Gargol by strangling her with her belt during a drunken argument.
Summary
- Three types of graphics files
- Bitmap
- Vector
- Metafile
- Image quality depends on various factors
- Standard file formats: .gif, .jpeg, .bmp, and .tif
- Nonstandard file formats: .tga, .rtl, .psd, and .svg
- Some image formats compress their data
- Lossless compression
- Lossy compression
- Digital camera photos are typically in raw and EXIF JPEG formats
- Recovering image files
- Carving file fragments
- Rebuilding image headers
- The Internet is best for learning more about file formats and their extensions
- Software
- Image editors
- Image viewers
- Fair use allows using copyrighted material for noncommercial or educational purposes without having to compensate the material's originator or owner
Scenario
University Police receive a disk image as digital evidence. Multiple student analysts examine the evidence and successfully identify the relevant artifact. However, questions are raised later about whether the findings would hold up in court.
Your task is to evaluate the scenarios below and determine which actions cause the most damage to the integrity and admissibility of the evidence.
Scenarios
Scenario A
A student mounts the disk image read-write by accident but quickly realizes the mistake. They still locate the correct artifact and document their findings.
Scenario B
A student runs a forensic tool that automatically modifies file timestamps within the mounted directory during analysis.
Scenario C
A student copies only the “interesting files” off the disk image and performs analysis on those copies instead of the full image.
Discussion Questions
In your group, discuss the following:
- Which scenario causes the most damage to the chain of custody and evidentiary integrity... and why?
- Scenario B looks like the one that causes the most damage, since it actually alters the data (in this case, the logs or metadata timestamps of files), which compromises the integrity of the evidence.
- Are any of these scenarios recoverable from a legal or procedural standpoint?
- Scenario A is the most recoverable from a legal/procedural standpoint, because the fix is to remount the image read-only and put proper procedures in place. It is still possible to show that integrity was not compromised in this scenario, for example by re-hashing the image and comparing the result with the acquisition hash.
- What is the minimum documentation or corrective action that could potentially salvage the findings?
- There need to be well-established procedures for handling digital evidence artifacts. All of these mistakes need to be remediated to the extent possible and logged correctly to keep track of exactly what has happened to the artifact. Student A needs to remount the disk image read-only, Student B needs to look at a snapshot or backup of the artifact to see whether the changes are recoverable, and Student C needs to perform the analysis on the full image, as that would make the evidence more admissible.
- What would you do differently next time to prevent this issue entirely?
- Next time, students need to follow proper procedures when dealing with digital evidence. They must always safeguard the integrity of the artifact by keeping a chain of custody and ensuring the data does not get tampered with through unauthorized procedures or bad practices.
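One way to show that an image was not altered during analysis is to hash it and compare the result with the hash recorded at acquisition. The sketch below is only an illustration: the function names are hypothetical, SHA-256 is an assumed choice of algorithm, and an in-memory stream stands in for the disk image file (in practice you would pass a file opened in read-only binary mode).

```python
import hashlib
import io


def sha256_stream(stream, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file-like object in chunks so large images fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_image(stream, acquisition_hash: str) -> bool:
    """True if the current hash matches the one recorded at acquisition."""
    return sha256_stream(stream) == acquisition_hash.lower()


# Made-up stand-in for a disk image and its recorded acquisition hash.
image_bytes = b"evidence" * 1000
recorded = hashlib.sha256(image_bytes).hexdigest()

print(verify_image(io.BytesIO(image_bytes), recorded))
print(verify_image(io.BytesIO(image_bytes + b"x"), recorded))
```

Running the check before and after analysis, and writing both results in the case log, gives the documentation the discussion questions ask for.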