Lexer Builder - Problem
Build a lexer that tokenizes a simple programming language. The lexer should identify and classify different types of tokens from source code.
Token Types:
- KEYWORD: if, else, while, for, return, int, float, bool
- IDENTIFIER: variable names (letters, digits, underscores; must start with a letter or underscore)
- NUMBER: integer or floating-point numbers
- OPERATOR: +, -, *, /, =, ==, !=, <, >, <=, >=
- DELIMITER: (, ), {, }, ;, ,
- WHITESPACE: spaces, tabs, newlines (not emitted in the output)
Return an array of tokens, where each token is a string in the format "TYPE:value".
Note: Skip whitespace tokens in the output. Process tokens from left to right.
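One possible way to approach the task is a regex-driven scanner. The sketch below is only an illustration, not a reference solution: the names `tokenize`, `TOKEN_SPEC`, and `MASTER` are made up for this example, and silently skipping characters that match no token type is an assumption, since the problem statement does not say how to treat them.

```python
import re

KEYWORDS = {"if", "else", "while", "for", "return", "int", "float", "bool"}

# Order matters inside OPERATOR: two-character operators are listed
# before their one-character prefixes so that "==" is not split into "=", "=".
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR", r"==|!=|<=|>=|[+\-*/=<>]"),
    ("DELIMITER", r"[(){};,]"),
    ("WHITESPACE", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source_code):
    """Return a list of "TYPE:value" strings, scanning left to right."""
    tokens = []
    pos = 0
    while pos < len(source_code):
        match = MASTER.match(source_code, pos)
        if match is None:
            # Assumption: characters outside every token class are skipped.
            pos += 1
            continue
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind == "WHITESPACE":
            continue  # whitespace is recognized but never emitted
        if kind == "IDENTIFIER" and value in KEYWORDS:
            kind = "KEYWORD"  # keywords are identifiers found in the reserved set
        tokens.append(f"{kind}:{value}")
    return tokens
```

Matching keywords through the identifier pattern and reclassifying afterwards means a word such as `iffy` is read as a single identifier, not as the keyword `if` followed by `fy`.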
Input & Output
Example 1 — Basic If Statement
Input:
sourceCode = "if(x == 5)"
Output:
["KEYWORD:if","DELIMITER:(","IDENTIFIER:x","OPERATOR:==","NUMBER:5","DELIMITER:)"]
💡 Note:
Tokenizes: 'if' as keyword, '(' and ')' as delimiters, 'x' as identifier, '==' as operator, '5' as number
Example 2 — Variable Declaration
Input:
sourceCode = "int count = 10;"
Output:
["KEYWORD:int","IDENTIFIER:count","OPERATOR:=","NUMBER:10","DELIMITER:;"]
💡 Note:
Tokenizes variable declaration: 'int' keyword, 'count' identifier, '=' operator, '10' number, ';' delimiter
Example 3 — Expression with Float
Input:
sourceCode = "result = 3.14 * radius"
Output:
["IDENTIFIER:result","OPERATOR:=","NUMBER:3.14","OPERATOR:*","IDENTIFIER:radius"]
💡 Note:
Handles the floating-point number '3.14' as a single NUMBER token inside an arithmetic expression of identifiers and operators
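To show how '3.14' can end up as one NUMBER token, here is a small hand-written scanning sketch. The helper name `scan_number` is hypothetical, and the rule that a dot belongs to the number only when a digit follows it is an assumption; the problem does not specify inputs such as "5." or ".5".

```python
DIGITS = "0123456789"


def scan_number(source_code, start):
    """Return (lexeme, next_index) for the number that begins at `start`."""
    i = start
    # Integer part: consume consecutive digits.
    while i < len(source_code) and source_code[i] in DIGITS:
        i += 1
    # Fractional part: take the dot only if at least one digit follows it
    # (assumption; a trailing dot is left for the caller to handle).
    if (
        i + 1 < len(source_code)
        and source_code[i] == "."
        and source_code[i + 1] in DIGITS
    ):
        i += 1
        while i < len(source_code) and source_code[i] in DIGITS:
            i += 1
    return source_code[start:i], i
```

The returned index tells the caller where to resume scanning, so the main loop can continue with the character immediately after the number.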
Constraints
- 1 ≤ sourceCode.length ≤ 10^4
- sourceCode contains only printable ASCII characters
- Keywords are case-sensitive
- Identifiers start with letter or underscore
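The last two constraints can be illustrated with a tiny classification sketch. The function name `classify_word` is hypothetical, and it assumes the word passed in has already been scanned as a run of letters, digits, and underscores.

```python
KEYWORDS = frozenset({"if", "else", "while", "for", "return", "int", "float", "bool"})


def classify_word(word):
    """Label an already-scanned word as KEYWORD or IDENTIFIER."""
    # Exact, case-sensitive membership test: "If" is not the keyword "if".
    if word in KEYWORDS:
        return f"KEYWORD:{word}"
    return f"IDENTIFIER:{word}"
```

Under this sketch, `classify_word("If")` yields an identifier because the comparison is case-sensitive, and a name with a leading underscore such as `_tmp` is also classified as an identifier.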