Taint Tracking Through UTF Extension - York University


Taint Tracking Through UTF Extension
by Boe Zekan
supervised by Dr. Mark Shtern and Dr. Vassilios Tzerpos
Computer Science and Engineering Faculty, York University
funded by an NSERC USRA Grant

Topics To Be Covered
- Some threats from user input
- Taint tracking
- Previous work
- Our work
  - Unicode
  - Implementations
  - Results

The Problem We Are Addressing
- An estimated 80% or more of web services contain security vulnerabilities [1]
- Many of these (50 to 82%) are user command injection vulnerabilities [1]

[1] Chin, Erika, and Wagner, David. Efficient Character-level Taint Tracking for Java. In Proceedings of SWS09, November 13, 2009, Chicago, Illinois, USA. ACM 978-1-60558-789-9/09/11

Our Goal
Reduce the security vulnerabilities that may occur when dealing with user input.

User input:
- input from an actual physical person
- input from another program, file, database, etc.
OR
- any data that is not a literal constant in our program and has not been generated by manipulating literal constants in our program

Some User Command Injection Threats
- SQL injection
- Cross-site scripting (XSS)
- Path traversal
- Shell injection attacks, HTTP response splitting, ...

SQL Injection
query = "SELECT * FROM students WHERE name = '" + studentName + "'";

With studentName = bobby:
SELECT * FROM students WHERE name = 'bobby'

From: "Exploits of a Mom" webcomic at http://xkcd.com/327/

With studentName = bobby'; DROP TABLE students; --
SELECT * FROM students WHERE name = 'bobby'; DROP TABLE students; --'

Cross-Site Scripting (XSS)

html = "..." + name + "..." + when + "..." + comment + "...";

Rendered with a benign comment:
  Anonymous
  0 Hours Ago
  Have you noticed that Soros spelled backwards is still Soros? Coincidence, I think not!

Rendered with a second comment (its text is not preserved here):
  Anonymous
  0 Hours Ago
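The injection examples above all come down to concatenating untrusted text into a command string. As a minimal sketch of the SQL injection case only, the Java below reproduces the query construction shown on the slides. The class and method names (`SqlConcatDemo`, `buildQuery`) are hypothetical and not part of the original work.

```java
// Illustrative sketch: reproduces the vulnerable string concatenation
// from the SQL injection example. Names here are hypothetical.
class SqlConcatDemo {

    // Builds the query the same way the vulnerable example does:
    // the user-supplied name is pasted straight into the SQL text.
    static String buildQuery(String studentName) {
        return "SELECT * FROM students WHERE name = '" + studentName + "'";
    }

    public static void main(String[] args) {
        // Benign input produces the intended single statement.
        System.out.println(buildQuery("bobby"));
        // Malicious input closes the string literal and appends a second statement.
        System.out.println(buildQuery("bobby'; DROP TABLE students; --"));
    }
}
```

Nothing in the resulting string records which characters came from the user, which is the gap taint tracking is meant to close.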

Path Traversal
filename = "/srv/www/users/bobby/" + filename;

With filename = myhomework1.doc:
/srv/www/users/bobby/myhomework1.doc

With filename = ../cse3000/tentativetestquestions.doc:
/srv/www/users/bobby/../cse3000/tentativetestquestions.doc
which resolves to
/srv/www/users/cse3000/tentativetestquestions.doc

To Prevent the Propagation of Malicious Data

Possible solution #1: Carefully parse/sanitize/analyze all data being sent to a sensitive data sink

SELECT * FROM students WHERE name = 'bobby'
SELECT * FROM students WHERE name = 'bobby'; DROP TABLE students; --'

Anonymous
0 Hours Ago
Have you noticed that Soros spelled backwards is still Soros? Coincidence, I think not!

Anonymous
0 Hours Ago

/srv/www/users/bobby/myhomework1.doc
/srv/www/users/bobby/../cse3000/tentativetestquestions.doc

... and hope that you catch everything from among all the possible combinations, and don't discard any valid requests

To Prevent the Propagation of Malicious Data
Possible solution #2: Carefully parse/sanitize/analyze all user supplied data being sent to a sensitive data sink

(same example queries, comments, and paths as for solution #1)

... and hope that you catch everything from among all the possible combinations, and don't discard any valid requests
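The path traversal example above can be reproduced with the standard `java.nio.file` API. The sketch below resolves the two example filenames against the user's directory and adds one containment check of the kind solution #1 would need. The class name, method names, and the containment check itself are illustrative assumptions, not something described in the talk.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch: shows where the example filenames end up after
// normalization. The containment check is an assumption for illustration.
class PathTraversalDemo {

    // Directory the application intends to serve files from (from the example).
    static final Path USER_DIR = Paths.get("/srv/www/users/bobby");

    // Joins the user-supplied name onto the base directory and collapses "..".
    static Path resolve(String filename) {
        return USER_DIR.resolve(filename).normalize();
    }

    // True only if the resolved path is still inside the user's directory.
    static boolean staysInside(String filename) {
        return resolve(filename).startsWith(USER_DIR);
    }

    public static void main(String[] args) {
        System.out.println(resolve("myhomework1.doc") + " inside=" + staysInside("myhomework1.doc"));
        String attack = "../cse3000/tentativetestquestions.doc";
        System.out.println(resolve(attack) + " inside=" + staysInside(attack));
    }
}
```

A check like this has to be written separately for every kind of sink, which is the burden the slides attribute to sanitizing everything by hand.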

Taint Tracking Makes Possible Solution 2
Taint tracking consists of three main steps:
1. Identifying untrusted input at the point where it enters the program and marking it as untrusted (i.e., tainted).
2. Propagating the taint information: at each subsequent computation, mark as tainted all data that is derived from an untrusted source.
3. Checking all data going into sensitive data sinks (e.g., a database, an output response, or a file): use the taint information to identify potential attacks.

Taint Tracking
Taint tracking comes in two possible flavours:
1. String level - mark the entire string as tainted
2. Character level - mark individual characters as tainted
   - allows for finer granularity

How Can Character Level Tainting Be Achieved?
One method, by Chin and Wagner of UC Berkeley [1]:
Expand the structure of the Java String class to include a boolean array which stores the taint status of each character in the string.

The Chin and Wagner Method
Their achievement: implementing a solution which minimizes the need to rewrite existing application code while transparently decreasing the vulnerability of applications to threats.
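To make the boolean-array idea concrete, here is a small sketch of a string paired with one taint flag per character. It is not Chin and Wagner's code (their work modifies `java.lang.String` itself); `TaintedText` and all of its methods are hypothetical names used only to illustrate the data structure.

```java
import java.util.Arrays;

// Illustrative sketch of character-level taint kept in a parallel boolean array.
// This is a stand-alone wrapper, not a modified java.lang.String.
class TaintedText {
    private final String value;
    private final boolean[] taint; // taint[i] is true if value.charAt(i) is untrusted

    private TaintedText(String value, boolean[] taint) {
        this.value = value;
        this.taint = taint;
    }

    // Text from the program itself: every flag is false.
    static TaintedText trusted(String s) {
        return new TaintedText(s, new boolean[s.length()]);
    }

    // Text from a user: every flag is true.
    static TaintedText untrusted(String s) {
        boolean[] flags = new boolean[s.length()];
        Arrays.fill(flags, true);
        return new TaintedText(s, flags);
    }

    // Propagation step: concatenation carries each character's flag along.
    TaintedText concat(TaintedText other) {
        boolean[] merged = new boolean[taint.length + other.taint.length];
        System.arraycopy(taint, 0, merged, 0, taint.length);
        System.arraycopy(other.taint, 0, merged, taint.length, other.taint.length);
        return new TaintedText(value + other.value, merged);
    }

    boolean isTainted(int index) {
        return taint[index];
    }

    String value() {
        return value;
    }

    public static void main(String[] args) {
        TaintedText q = trusted("name = '").concat(untrusted("bobby")).concat(trusted("'"));
        System.out.println(q.value() + " first user char tainted: " + q.isTainted(8));
    }
}
```

The extra array is per string, so the taint flags live alongside the text rather than inside it.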

Their shortcomings:
- Specific to Java
- Increases the memory required to store a string in Java
- The taint status of the Java char primitive cannot be determined
- Not readily adapted to other programming languages
- Their taint information cannot propagate onwards to a database, or to an application, script, or procedure running in another programming language

How Can Character Level Tainting Be Achieved?
Our method: Expand Unicode to include tainted characters

Our achievements:
- Implement a solution which minimizes the need to rewrite existing application source code while transparently decreasing the vulnerability of applications to threats
- Is not specific to Java
- Does not increase the memory required to store a string in Java
- The taint status of the Java char primitive can be determined
- Is readily adapted to other programming languages
- The taint information can propagate onwards to a database, or to an application, script, or procedure running in another programming language

What is Unicode?
- A scheme that assigns a codepoint to each character in current use throughout the world
- Has been implemented in XML, Java, Microsoft .NET, web browsers, databases, and modern operating systems

Unicode
- Can accommodate 1,114,112 codepoints in 17 planes of 65,536 codepoints each
- Most of the codespace is still unassigned
- Mechanisms (e.g., UTF-8, UTF-16, ...) already exist that allow software to manipulate and store all these codepoints even if no characters have been assigned to them

Our Design, Part 1: Tainting & Propagating Taint
We create a tainted character for every character and assign it an unused codepoint.

Ex.
Untainted:
  A  (ASCII: 41 hex)  (Unicode: U+0041)
  z  (ASCII: 7A hex)  (Unicode: U+007A)
Tainted:
  A  (Unicode: U+E041)
  z  (Unicode: U+E07A)

Now wherever a character's codepoint goes, its tainted or untainted status goes with it.

Tainting Algorithms
To taint a user input character x:
  codepoint(tainted x) = codepoint(x) + OFFSET

To check whether character x is tainted:
  if (codepoint(x) is in tainted codepoint range)
    character x is tainted   // is user supplied
  else
    character x is untainted

To remove taint from tainted character x:
  codepoint(x) = codepoint(tainted x) - OFFSET

Our Design, Part 2

The Transparent Protection Framework
Consider a typical vulnerable web application:
[diagram not preserved]

Designing The Added Transparent Protection Framework
Consider a less vulnerable web application:
- User's OS has fonts which incorporate tainted characters
- Request Intercept Wrapper uses custom taint-aware classes/functions and is generic for a given technology
- Application is on a server with taint awareness built into its library functions
- Database Driver Intercept Wrapper uses custom taint-aware classes/functions specific to the database to check for SQL injection and drop malicious queries

Implementation Details: The Font
For a final, universally adopted application:
- System fonts would be expanded to include tainted characters, which would look identical to their untainted counterparts
  Ex. untainted ABCDE ... vs tainted ABCDE ...
For our proof of concept:
- Tainted and untainted characters appear different, so they can be easily distinguished on computer screens and in documents
  Ex. untainted ABCDE ... vs tainted ... (the tainted glyphs are not preserved here)
- We used Type-Light freeware to modify Windows' Courier New font
  - installed it by dragging the original ttf file out of the Fonts directory and dragging in our new ttf file

Implementation Details: The Application
- Has no knowledge of taint
- Counts the number of visits of this user
- 1st query to db checks if the user's name is in the db
  - If no, then insert the name into the db and set the visits count to 1
  - If yes, then increment the visits count by 1 in the db
- 2nd query to db outputs the number of visits for the user's name from the db's record
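The offset-based tainting algorithms from Part 1 of the design (add OFFSET to taint, test the range to check, subtract OFFSET to untaint) can be sketched in a few lines of Java. `OFFSET = 0xE000` is inferred from the A to U+E041 example. U+E000 lies in the Unicode Private Use Area of the Basic Multilingual Plane, so a tainted character still fits in a single Java `char`. The size of the tainted range (`MAX_BASE`) and all names are assumptions made for this sketch.

```java
// Illustrative sketch of the taint / check / untaint algorithms.
// OFFSET is inferred from the slides; MAX_BASE is an assumption.
class TaintCodec {
    static final int OFFSET = 0xE000;   // A (U+0041) -> tainted A (U+E041)
    static final int MAX_BASE = 0x00FF; // assumption: only U+0000..U+00FF get tainted twins

    // "if codepoint(x) is in tainted codepoint range"
    static boolean isTainted(int cp) {
        return cp >= OFFSET && cp <= OFFSET + MAX_BASE;
    }

    // codepoint(tainted x) = codepoint(x) + OFFSET; other code points are left unchanged
    static int taint(int cp) {
        return (cp >= 0 && cp <= MAX_BASE) ? cp + OFFSET : cp;
    }

    // codepoint(x) = codepoint(tainted x) - OFFSET; untainted code points are left unchanged
    static int untaint(int cp) {
        return isTainted(cp) ? cp - OFFSET : cp;
    }

    static String taint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.appendCodePoint(taint(cp)));
        return sb.toString();
    }

    static String untaint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.appendCodePoint(untaint(cp)));
        return sb.toString();
    }

    static boolean containsTaint(String s) {
        return s.codePoints().anyMatch(TaintCodec::isTainted);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(taint("A").codePointAt(0))); // e041
        System.out.println(Integer.toHexString(taint("z").codePointAt(0))); // e07a
        System.out.println(untaint(taint("bobby")));                        // bobby
    }
}
```

Because the taint is part of the code point itself, ordinary string concatenation propagates it with no extra bookkeeping.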

Implementation Details: The Transparent Protection Framework
We implemented our framework on our typical web application in four different technologies:
1. PHP/MySQL on Apache (under Windows XP)
2. PHP/DB2 on Apache (under Linux)
3. Java Servlet/DB2 on Tomcat 7 (under Linux)
4. PHP on Apache (under Linux) calling Java Servlet/DB2 on Tomcat 7 (under Linux)

To do this we set the UTF-8 or Unicode encoding option everywhere it was available, and selected Courier New as the font wherever possible.

Implementation Details: The Form Page

Implementation Details: The Request Intercept Wrapper
Two versions were used:
1. PHP version, which uses cURL to interact with the application
2. Java Servlet version, which uses a connection to interact with the application
- Both versions handle both POST and GET requests
- The browser only sees the wrapper's URL, never the application page's URL
- Both will work with any form, no matter the combination of controls

Implementation Details: PHP Application & Db Driver Intercept
- Four applications exist - essentially the same code with minor variations
- Two Database Driver Intercept Wrappers exist - essentially the same code with minor variations
  - they are PHP include files
  - each file has taint-aware functions that wrap the query and fetch array functions of their respective databases

Implementation Results: PHP Application & Db Driver Intercept
Was not totally transparent - the application needed modification to specify the include files and rename two functions

But we did successfully:
- propagate taint from user input all the way back to the user output
- transparently detect and stop SQL injection
- show our method working on different databases and different operating systems
- produce an easy-to-implement solution to increase the security of legacy programs
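The talk does not spell out how the driver intercept decides that a query is malicious. As one plausible policy, the sketch below (in Java rather than PHP) accepts a query only if every tainted character sits inside a string literal opened by an untainted quote, and rejects any tainted quote character. The policy, the `QueryGuard` name, the `OFFSET` and `MAX_BASE` values, and the choice of `SecurityException` are all assumptions for illustration.

```java
// Illustrative sketch of a taint-aware query check. The acceptance policy
// is an assumption; the original wrappers may use a different rule.
class QueryGuard {
    static final int OFFSET = 0xE000;   // inferred from A -> U+E041
    static final int MAX_BASE = 0x00FF; // assumption: size of the tainted range

    static boolean isTainted(int cp) {
        return cp >= OFFSET && cp <= OFFSET + MAX_BASE;
    }

    // Demo helper: marks every character of user input as tainted.
    static String taint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.appendCodePoint(cp <= MAX_BASE ? cp + OFFSET : cp));
        return sb.toString();
    }

    static String untaint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.appendCodePoint(isTainted(cp) ? cp - OFFSET : cp));
        return sb.toString();
    }

    // Tainted characters may only appear as data inside '...' literals,
    // and a tainted quote is never allowed (it would end the literal).
    static boolean isSafe(String query) {
        boolean inLiteral = false;
        for (int i = 0; i < query.length(); ) {
            int cp = query.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '\'') {            // untainted quote written by the application
                inLiteral = !inLiteral;
            } else if (isTainted(cp)) {
                boolean taintedQuote = (cp - OFFSET) == '\'';
                if (!inLiteral || taintedQuote) {
                    return false;
                }
            }
        }
        return true;
    }

    // Returns the plain query for the real driver, or refuses to pass it on.
    static String guard(String query) {
        if (!isSafe(query)) {
            throw new SecurityException("query dropped: user input escapes its string literal");
        }
        return untaint(query);
    }

    public static void main(String[] args) {
        String ok = "SELECT * FROM students WHERE name = '" + taint("bobby") + "'";
        String bad = "SELECT * FROM students WHERE name = '" + taint("bobby'; DROP TABLE students; --") + "'";
        System.out.println(isSafe(ok) + " " + isSafe(bad));
    }
}
```

This simple rule also rejects legitimate names that contain an apostrophe, so a production check would need a more careful treatment of quoting.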

Implementation Details: Java Application
- One application, reachable in two ways
- Has modified String & Character classes that will not break the application at ("A").equals("[tainted A]") or ('A').equals('[tainted A]')

Implementation Details: Java DB2 Database Intercept Wrapper
- Is a collection of custom taint-aware classes
- The original ibm.db2.jdbc.app.DB2Driver class is wrapped with our taint-aware Db2DriverIntercept class
- We then drill down and also wrap the Connection, PreparedStatement, and ResultSet interfaces and augment their existing methods to provide transparent SQL injection protection

Implementation Results: Java Application & Db Driver Intercept

Was not totally transparent - the application needs to call our driver instead of IBM's database driver

But we additionally showed that our character level taint method could:
- work on different programming languages (PHP and Java) and paradigms (procedural and OOP)
- propagate between different languages and different servers
- be handled transparently by modifying Java's String and Character class operations

Application Breaks & Work Arounds
Java: the char is a primitive
- if ('A' == '[tainted A]') is as far as we can keep taint information accurate
- thereafter, taint information is lost: no further propagation
- if allowed to alter source code, then replace ('A' == '[tainted A]') with the taint-aware custom method ('A'.equals('[tainted A]')) to allow taint to propagate even further within an application

Application Breaks & Work Arounds
PHP: strings are considered primitive
- if ("AB" == "[tainted AB]") is as far as we can keep taint information accurate
- thereafter, taint information is lost: no further propagation
- if allowed to alter source code, then replace ("AB" == "[tainted AB]") with the taint-aware custom method ("AB".equals("[tainted AB]")) to allow taint to propagate even further within an application

NB! If our method were to be adopted universally, the above could be overcome by modifying the JVM or PHP engine.

Other Possible Uses of Our Character Level Tainting Method
- Tainting and tracking of multiple input sources
  - there are a lot of unassigned codepoints
  - many tainted character sets could be created to indicate different data sources (e.g., keyboard, file, database, remote login, ...)
- Storing tainted characters in log files to make user input immediately recognizable
- Tainted characters can be stored in a database and retrieved by using taint in queries
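The multiple-source idea can be sketched by giving each input source its own block of code points. The block size (0x100) and the particular offsets below are assumptions chosen to stay inside the Private Use Area (U+E000 to U+F8FF); the talk does not specify them, and `TaintSource` is a hypothetical name.

```java
import java.util.Optional;

// Illustrative sketch: one tainted character set per data source.
// Offsets and block size are assumptions, not values from the talk.
enum TaintSource {
    KEYBOARD(0xE000), FILE(0xE100), DATABASE(0xE200);

    static final int BLOCK_SIZE = 0x100; // each source covers base code points U+0000..U+00FF

    final int offset;

    TaintSource(int offset) {
        this.offset = offset;
    }

    // Marks every character of s as coming from this source.
    String taint(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.appendCodePoint(cp < BLOCK_SIZE ? cp + offset : cp));
        return sb.toString();
    }

    // Reports which source a code point was tainted by, if any.
    static Optional<TaintSource> sourceOf(int cp) {
        for (TaintSource s : values()) {
            if (cp >= s.offset && cp < s.offset + BLOCK_SIZE) {
                return Optional.of(s);
            }
        }
        return Optional.empty();
    }
}

class TaintSourceDemo {
    public static void main(String[] args) {
        String fromFile = TaintSource.FILE.taint("a");
        System.out.println(TaintSource.sourceOf(fromFile.codePointAt(0))); // Optional[FILE]
        System.out.println(TaintSource.sourceOf('a'));                     // Optional.empty
    }
}
```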
