Ruby 3.3.2p78 (2024-05-30 revision e5a195edf62fe1bf7146a191da13fa1c4fecbd71)
pm_parser Struct Reference

This struct represents the overall parser.

#include <parser.h>

Data Fields

pm_lex_state_t lex_state
 The current state of the lexer.
 
int enclosure_nesting
 Tracks the current nesting of (), [], and {}.
 
int lambda_enclosure_nesting
 Used to temporarily track the nesting of enclosures to determine if a { is the beginning of a lambda following the parameters of a lambda.
 
int brace_nesting
 Used to track the nesting of braces to ensure we get the correct value when we are interpolating blocks with braces.
 
pm_state_stack_t do_loop_stack
 The stack used to determine if a do keyword belongs to the predicate of a while, until, or for loop.
 
pm_state_stack_t accepts_block_stack
 The stack used to determine if a do keyword belongs to the beginning of a block.
 
struct {
 
   pm_lex_mode_t * current
 The current mode of the lexer.
 
   pm_lex_mode_t stack [PM_LEX_STACK_SIZE]
 The stack of lexer modes.
 
   size_t index
 The current index into the lexer mode stack.
 
} lex_modes
 A stack of lex modes.
 
const uint8_t * start
 The pointer to the start of the source.
 
const uint8_t * end
 The pointer to the end of the source.
 
pm_token_t previous
 The previous token we were considering.
 
pm_token_t current
 The current token we're considering.
 
const uint8_t * next_start
 This is a special field set on the parser when we need the parser to jump to a specific location when lexing the next token, as opposed to just using the end of the previous token.
 
const uint8_t * heredoc_end
 This field indicates the end of a heredoc whose identifier was found on the current line.
 
pm_list_t comment_list
 The list of comments that have been found while parsing.
 
pm_list_t magic_comment_list
 The list of magic comments that have been found while parsing.
 
pm_location_t data_loc
 The optional location of the __END__ keyword and its contents.
 
pm_list_t warning_list
 The list of warnings that have been found while parsing.
 
pm_list_t error_list
 The list of errors that have been found while parsing.
 
pm_scope_t * current_scope
 The current local scope.
 
pm_context_node_t * current_context
 The current parsing context.
 
const pm_encoding_t * encoding
 The encoding functions for the current file are attached to the parser as it's parsing so that they can be changed by a magic comment.
 
pm_encoding_changed_callback_t encoding_changed_callback
 When the encoding that is being used to parse the source is changed by prism, we provide the ability here to call out to a user-defined function.
 
const uint8_t * encoding_comment_start
 This pointer indicates where a comment must start if it is to be considered an encoding comment.
 
pm_lex_callback_t * lex_callback
 This is an optional callback that can be attached to the parser that will be called whenever a new token is lexed by the parser.
 
pm_string_t filepath_string
 This is the path of the file being parsed.
 
pm_constant_pool_t constant_pool
 This constant pool keeps all of the constants defined throughout the file so that we can reference them later.
 
pm_newline_list_t newline_list
 This is the list of newline offsets in the source file.
 
pm_node_flags_t integer_base
 We want to add a flag to integer nodes that indicates their base.
 
pm_string_t current_string
 This string is used to pass information from the lexer to the parser.
 
int32_t start_line
 The line number at the start of the parse.
 
const pm_encoding_t * explicit_encoding
 When a string-like expression is being lexed, any byte or escape sequence that resolves to a value whose top bit is set (i.e., >= 0x80) will explicitly set the encoding to the same encoding as the source.
 
bool command_start
 Whether or not we're at the beginning of a command.
 
bool recovering
 Whether or not we're currently recovering from a syntax error.
 
bool encoding_changed
 Whether or not the encoding has been changed by a magic comment.
 
bool pattern_matching_newlines
 This flag indicates that we are currently parsing a pattern matching expression, which affects the calculation of newlines.
 
bool in_keyword_arg
 This flag indicates that we are currently parsing a keyword argument.
 
pm_constant_id_t current_param_name
 The ID of the parameter name whose default value is currently being parsed.
 
bool semantic_token_seen
 Whether or not the parser has seen a token that has semantic meaning (i.e., a token that is not a comment or whitespace).
 
bool frozen_string_literal
 Whether or not we have found a frozen_string_literal magic comment with a true value.
 
bool suppress_warnings
 Whether or not warnings should be suppressed.
 

Detailed Description

This struct represents the overall parser.

It contains a reference to the source file, as well as pointers that indicate where in the source it's currently parsing. It also contains the most recent and current token that it's considering.

Definition at line 489 of file parser.h.
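
The relationship between the source pointers and the two tokens can be illustrated with a minimal sketch. It is not prism's implementation: `token_t`, `parser_t`, `parser_init()`, and `parser_advance()` are hypothetical stand-ins, and the "lexer" here only splits on spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token: a half-open byte range [start, end) in the source. */
typedef struct {
    const uint8_t *start;
    const uint8_t *end;
} token_t;

/* Hypothetical, heavily reduced parser holding only the fields discussed. */
typedef struct {
    const uint8_t *start; /* start of the source */
    const uint8_t *end;   /* end of the source */
    token_t previous;     /* the previous token */
    token_t current;      /* the token being considered */
} parser_t;

static void parser_init(parser_t *parser, const uint8_t *source, size_t size) {
    assert(parser != NULL && source != NULL);
    parser->start = source;
    parser->end = source + size;
    parser->previous = (token_t) { source, source };
    parser->current = (token_t) { source, source };
}

/* Shift current into previous, then read the next space-separated word.
 * Returns false once no further token is available. */
static bool parser_advance(parser_t *parser) {
    assert(parser != NULL);
    parser->previous = parser->current;

    const uint8_t *cursor = parser->current.end;
    while (cursor < parser->end && *cursor == ' ') cursor++;
    parser->current.start = cursor;

    while (cursor < parser->end && *cursor != ' ') cursor++;
    parser->current.end = cursor;

    return parser->current.start < parser->current.end;
}
```

Because tokens are stored as pointer pairs into the source buffer, no token text is copied in this sketch.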

Field Documentation

◆ accepts_block_stack

pm_state_stack_t pm_parser::accepts_block_stack

The stack used to determine if a do keyword belongs to the beginning of a block.

Definition at line 518 of file parser.h.

◆ brace_nesting

int pm_parser::brace_nesting

Used to track the nesting of braces to ensure we get the correct value when we are interpolating blocks with braces.

Definition at line 506 of file parser.h.

◆ command_start

bool pm_parser::command_start

Whether or not we're at the beginning of a command.

Definition at line 672 of file parser.h.

◆ comment_list

pm_list_t pm_parser::comment_list

The list of comments that have been found while parsing.

Definition at line 560 of file parser.h.

Referenced by pm_parser_free(), pm_serialize_content(), pm_serialize_lex(), and pm_serialize_parse_comments().

◆ constant_pool

pm_constant_pool_t pm_parser::constant_pool

This constant pool keeps all of the constants defined throughout the file so that we can reference them later.

Definition at line 615 of file parser.h.

Referenced by pm_parser_free(), pm_parser_init(), and pm_serialize_content().
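
As a rough illustration of why a pool is useful, the sketch below interns names so that each distinct constant gets one stable ID. The types, the fixed capacity, the linear search, and the 1-based IDs are all assumptions made for brevity and are not taken from pm_constant_pool_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_CAPACITY 16 /* hypothetical fixed capacity */

typedef uint32_t constant_id_t;

/* Hypothetical pool: stores borrowed pointers to NUL-terminated names. */
typedef struct {
    const char *names[POOL_CAPACITY];
    uint32_t size;
} constant_pool_t;

/* Return the existing ID for name, or insert it and return a new ID.
 * IDs are 1-based in this sketch; 0 signals that the pool is full. */
static constant_id_t constant_pool_insert(constant_pool_t *pool, const char *name) {
    assert(pool != NULL && name != NULL);
    for (uint32_t index = 0; index < pool->size; index++) {
        if (strcmp(pool->names[index], name) == 0) return index + 1;
    }
    if (pool->size == POOL_CAPACITY) return 0;
    pool->names[pool->size++] = name;
    return pool->size;
}

/* Map an ID back to its name, or NULL if the ID is not in the pool. */
static const char *constant_pool_lookup(const constant_pool_t *pool, constant_id_t id) {
    assert(pool != NULL);
    if (id == 0 || id > pool->size) return NULL;
    return pool->names[id - 1];
}
```

Nodes can then carry a small integer ID instead of a string, and the name is resolved later through the lookup.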

◆ current [1/2]

pm_lex_mode_t* pm_parser::current

The current mode of the lexer.

Definition at line 523 of file parser.h.

Referenced by pm_parser_init().

◆ current [2/2]

pm_token_t pm_parser::current

The current token we're considering.

Definition at line 542 of file parser.h.

◆ current_context

pm_context_node_t* pm_parser::current_context

The current parsing context.

Definition at line 578 of file parser.h.

◆ current_param_name

pm_constant_id_t pm_parser::current_param_name

The ID of the parameter name whose default value is currently being parsed.

Definition at line 694 of file parser.h.

◆ current_scope

pm_scope_t* pm_parser::current_scope

The current local scope.

Definition at line 575 of file parser.h.

Referenced by pm_parser_free().

◆ current_string

pm_string_t pm_parser::current_string

This string is used to pass information from the lexer to the parser.

It is particularly necessary because of escape sequences.

Definition at line 632 of file parser.h.

◆ data_loc

pm_location_t pm_parser::data_loc

The optional location of the __END__ keyword and its contents.

Definition at line 566 of file parser.h.

◆ do_loop_stack

pm_state_stack_t pm_parser::do_loop_stack

The stack used to determine if a do keyword belongs to the predicate of a while, until, or for loop.

Definition at line 512 of file parser.h.
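
A stack of yes/no answers such as this one can be kept very compactly by packing one boolean per bit into an integer. The sketch below shows that idea under the assumption of a 32-bit backing integer; the names are hypothetical, and the actual definition of pm_state_stack_t is not shown in this reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a state stack: each bit is one boolean entry,
 * with the top of the stack in the lowest bit. */
typedef uint32_t state_stack_t;

/* Push a boolean by shifting older entries up one bit. */
static void state_stack_push(state_stack_t *stack, bool value) {
    assert(stack != NULL);
    *stack = (*stack << 1) | (value ? 1u : 0u);
}

/* Pop the top entry by shifting it out. */
static void state_stack_pop(state_stack_t *stack) {
    assert(stack != NULL);
    *stack >>= 1;
}

/* Read the top entry without modifying the stack. */
static bool state_stack_top(const state_stack_t *stack) {
    assert(stack != NULL);
    return (*stack & 1u) != 0;
}
```

With this layout, a parser entering the predicate of a while loop would push true, and any nested construct that resets the question would push false, so checking whether a do belongs to the loop is a single bit test. The depth is limited to the width of the integer in this sketch.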

◆ enclosure_nesting

int pm_parser::enclosure_nesting

Tracks the current nesting of (), [], and {}.

Definition at line 494 of file parser.h.
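
The counter behaves roughly like the sketch below, which scans a string and tracks how deeply (), [], and {} are nested. The helper `max_enclosure_depth()` is hypothetical and deliberately naive: it does not check that closers match their openers and it ignores strings and comments, which a real lexer has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the deepest nesting reached while scanning
 * source, or -1 if a closer appears when nothing is open. */
static int max_enclosure_depth(const char *source) {
    assert(source != NULL);
    int nesting = 0; /* plays the role of enclosure_nesting */
    int deepest = 0;

    for (const char *cursor = source; *cursor != '\0'; cursor++) {
        switch (*cursor) {
            case '(': case '[': case '{':
                nesting++;
                if (nesting > deepest) deepest = nesting;
                break;
            case ')': case ']': case '}':
                if (nesting == 0) return -1;
                nesting--;
                break;
            default:
                break;
        }
    }

    return deepest;
}
```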

◆ encoding

const pm_encoding_t* pm_parser::encoding

The encoding functions for the current file are attached to the parser as it's parsing so that they can be changed by a magic comment.

Definition at line 584 of file parser.h.

Referenced by pm_serialize_content(), pm_serialize_lex(), pm_serialize_parse_comments(), and pm_strpbrk().

◆ encoding_changed

bool pm_parser::encoding_changed

Whether or not the encoding has been changed by a magic comment.

We use this to provide a fast path for the lexer instead of going through the function pointer.

Definition at line 682 of file parser.h.

Referenced by pm_strpbrk().

◆ encoding_changed_callback

pm_encoding_changed_callback_t pm_parser::encoding_changed_callback

When the encoding that is being used to parse the source is changed by prism, we provide the ability here to call out to a user-defined function.

Definition at line 591 of file parser.h.

Referenced by pm_parser_register_encoding_changed_callback().
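
The callback pattern described here can be sketched as follows. Everything in the block is a hypothetical model (a reduced `parser_t`, a callback that receives the parser, and a `change_encoding()` helper); it does not reproduce the signature of pm_encoding_changed_callback_t or of pm_parser_register_encoding_changed_callback().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct parser parser_t;

/* Hypothetical callback type: invoked after the encoding has changed. */
typedef void (*encoding_changed_callback_t)(parser_t *parser);

struct parser {
    const char *encoding_name; /* stand-in for the encoding pointer */
    encoding_changed_callback_t encoding_changed_callback;
    int notifications;         /* used only by the example callback */
};

/* Attach a user-defined function to the parser. */
static void register_encoding_changed_callback(parser_t *parser, encoding_changed_callback_t callback) {
    assert(parser != NULL);
    parser->encoding_changed_callback = callback;
}

/* Switch encodings and notify the callback if one is registered. */
static void change_encoding(parser_t *parser, const char *name) {
    assert(parser != NULL && name != NULL);
    parser->encoding_name = name;
    if (parser->encoding_changed_callback != NULL) {
        parser->encoding_changed_callback(parser);
    }
}

/* Example callback that counts how often it was called. */
static void count_encoding_change(parser_t *parser) {
    parser->notifications++;
}
```

Leaving the callback NULL by default keeps the common path free of any indirect call.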

◆ encoding_comment_start

const uint8_t* pm_parser::encoding_comment_start

This pointer indicates where a comment must start if it is to be considered an encoding comment.

Definition at line 597 of file parser.h.

Referenced by pm_parser_init().

◆ end

const uint8_t* pm_parser::end

The pointer to the end of the source.

Definition at line 536 of file parser.h.

◆ error_list

pm_list_t pm_parser::error_list

The list of errors that have been found while parsing.

Definition at line 572 of file parser.h.

Referenced by pm_parse_success_p(), pm_parser_free(), pm_serialize_content(), and pm_serialize_lex().

◆ explicit_encoding

const pm_encoding_t* pm_parser::explicit_encoding

When a string-like expression is being lexed, any byte or escape sequence that resolves to a value whose top bit is set (i.e., >= 0x80) will explicitly set the encoding to the same encoding as the source.

Alternatively, if a unicode escape sequence is used (e.g., \u{80}) that resolves to a value whose top bit is set, then the encoding will be explicitly set to UTF-8.

The next time this happens, if the encoding that is about to become the explicitly set encoding does not match the previously set explicit encoding, a mixed encoding error will be emitted.

When the expression is finished being lexed, the explicit encoding controls the encoding of the expression. For the most part this means that the expression will either be encoded in the source encoding or UTF-8. This holds for all encodings except US-ASCII. If the source is US-ASCII and an explicit encoding was set that was not UTF-8, then the expression will be encoded as ASCII-8BIT.

Note that if the expression is a list, different elements within the same list can have different encodings, so this will get reset between each element. Furthermore all of this only applies to lists that support interpolation, because otherwise escapes that could change the encoding are ignored.

At first glance, it may make more sense for this to live on the lexer mode, but we need it here to communicate back to the parser for character literals that do not push a new lexer mode.

Definition at line 669 of file parser.h.
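
The rules above can be modeled in a few lines. The sketch below is an illustration only: `encoding_t`, `string_lex_t`, `record_high_bit()`, and `finish_expression()` are hypothetical names, encodings are compared by pointer identity, and falling back to the source encoding when no explicit encoding was set is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encoding descriptor; identity is compared by address. */
typedef struct { const char *name; } encoding_t;

static const encoding_t ENC_UTF_8 = { "UTF-8" };
static const encoding_t ENC_US_ASCII = { "US-ASCII" };
static const encoding_t ENC_ASCII_8BIT = { "ASCII-8BIT" };
static const encoding_t ENC_EUC_JP = { "EUC-JP" };

typedef struct {
    const encoding_t *source;            /* encoding of the source file */
    const encoding_t *explicit_encoding; /* NULL until a high-bit value is seen */
} string_lex_t;

/* Record a value with its top bit set. A unicode escape asks for UTF-8,
 * anything else asks for the source encoding. Returns false when the
 * request conflicts with an earlier one (a mixed encoding error). */
static bool record_high_bit(string_lex_t *lex, bool unicode_escape) {
    assert(lex != NULL && lex->source != NULL);
    const encoding_t *wanted = unicode_escape ? &ENC_UTF_8 : lex->source;
    if (lex->explicit_encoding != NULL && lex->explicit_encoding != wanted) {
        return false;
    }
    lex->explicit_encoding = wanted;
    return true;
}

/* Decide the encoding of the finished expression and reset the state so
 * the next expression or list element starts fresh. */
static const encoding_t *finish_expression(string_lex_t *lex) {
    assert(lex != NULL && lex->source != NULL);
    const encoding_t *result = lex->source; /* assumed default */
    if (lex->explicit_encoding != NULL) {
        if (lex->source == &ENC_US_ASCII && lex->explicit_encoding != &ENC_UTF_8) {
            result = &ENC_ASCII_8BIT;
        } else {
            result = lex->explicit_encoding;
        }
    }
    lex->explicit_encoding = NULL;
    return result;
}
```

In this model a raw high byte followed by a unicode escape in an EUC-JP source yields a mixed encoding error, while the same pair in a UTF-8 source does not, because both requests resolve to UTF-8.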

◆ filepath_string

pm_string_t pm_parser::filepath_string

This is the path of the file being parsed.

We use the filepath when constructing SourceFileNodes.

Definition at line 609 of file parser.h.

Referenced by pm_parser_free(), and pm_parser_init().

◆ frozen_string_literal

bool pm_parser::frozen_string_literal

Whether or not we have found a frozen_string_literal magic comment with a true value.

Definition at line 706 of file parser.h.

Referenced by pm_parser_init().

◆ heredoc_end

const uint8_t* pm_parser::heredoc_end

This field indicates the end of a heredoc whose identifier was found on the current line.

If another heredoc is found on the same line, then this will be moved forward to the end of that heredoc. If no heredocs are found on a line then this is NULL.

Definition at line 557 of file parser.h.

◆ in_keyword_arg

bool pm_parser::in_keyword_arg

This flag indicates that we are currently parsing a keyword argument.

Definition at line 691 of file parser.h.

◆ index

size_t pm_parser::index

The current index into the lexer mode stack.

Definition at line 529 of file parser.h.

Referenced by pm_parser_free().

◆ integer_base

pm_node_flags_t pm_parser::integer_base

We want to add a flag to integer nodes that indicates their base.

We only want to parse these once, but we don't have space on the token itself to communicate this information. So we store it here and pass it through when we find tokens that we need it for.

Definition at line 626 of file parser.h.
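
A simplified picture of this hand-off: the lexer inspects the literal's prefix once, stores a flag, and the parser later copies that flag onto the node it builds. The flag values and the `integer_base_flag()` helper below are hypothetical; only the Ruby literal prefixes (0b, 0o or a leading 0, 0d, 0x) are taken as given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t node_flags_t;

/* Hypothetical flag values; the real constants are not shown here. */
#define FLAG_BINARY      ((node_flags_t) 1)
#define FLAG_DECIMAL     ((node_flags_t) 2)
#define FLAG_OCTAL       ((node_flags_t) 4)
#define FLAG_HEXADECIMAL ((node_flags_t) 8)

/* Determine the base of an integer literal from its prefix. The result
 * is what a lexer would store in integer_base for the parser to pick up. */
static node_flags_t integer_base_flag(const char *literal) {
    assert(literal != NULL);
    if (literal[0] == '0') {
        switch (literal[1]) {
            case 'b': case 'B': return FLAG_BINARY;
            case 'o': case 'O': return FLAG_OCTAL;
            case 'd': case 'D': return FLAG_DECIMAL;
            case 'x': case 'X': return FLAG_HEXADECIMAL;
            default:
                /* A leading zero followed by an octal digit, as in 017. */
                if (literal[1] >= '0' && literal[1] <= '7') return FLAG_OCTAL;
                break;
        }
    }
    return FLAG_DECIMAL;
}
```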

◆ lambda_enclosure_nesting

int pm_parser::lambda_enclosure_nesting

Used to temporarily track the nesting of enclosures to determine if a { is the beginning of a lambda following the parameters of a lambda.

Definition at line 500 of file parser.h.

◆ lex_callback

pm_lex_callback_t* pm_parser::lex_callback

This is an optional callback that can be attached to the parser that will be called whenever a new token is lexed by the parser.

Definition at line 603 of file parser.h.

Referenced by pm_serialize_lex(), and pm_serialize_parse_lex().

◆ [struct]

struct { ... } pm_parser::lex_modes

A stack of lex modes.

Referenced by pm_parser_free(), and pm_parser_init().
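
The three members (current, stack, and index) work together roughly as in the sketch below. It is a reduced model with hypothetical names and a tiny capacity, and it simply refuses to push when the fixed array is full; how prism handles deeper nesting is not covered by this reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LEX_STACK_SIZE 4 /* hypothetical; stands in for PM_LEX_STACK_SIZE */

typedef enum { MODE_DEFAULT, MODE_STRING, MODE_EMBEXPR } lex_mode_kind_t;

typedef struct { lex_mode_kind_t kind; } lex_mode_t;

typedef struct {
    lex_mode_t *current;              /* the mode the lexer is in now */
    lex_mode_t stack[LEX_STACK_SIZE]; /* storage for the modes */
    size_t index;                     /* index of current within stack */
} lex_modes_t;

/* Start in the default mode at the bottom of the stack. */
static void lex_modes_init(lex_modes_t *modes) {
    assert(modes != NULL);
    modes->index = 0;
    modes->stack[0].kind = MODE_DEFAULT;
    modes->current = &modes->stack[0];
}

/* Enter a nested mode. Returns false if the fixed array is full. */
static bool lex_modes_push(lex_modes_t *modes, lex_mode_kind_t kind) {
    assert(modes != NULL);
    if (modes->index + 1 >= LEX_STACK_SIZE) return false;
    modes->index++;
    modes->stack[modes->index].kind = kind;
    modes->current = &modes->stack[modes->index];
    return true;
}

/* Leave the current mode. The bottom (default) mode is never popped. */
static bool lex_modes_pop(lex_modes_t *modes) {
    assert(modes != NULL);
    if (modes->index == 0) return false;
    modes->index--;
    modes->current = &modes->stack[modes->index];
    return true;
}
```

For example, lexing "a#{b}" would push a string mode at the opening quote, push an embedded-expression mode at #{, and pop back twice as each construct closes.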

◆ lex_state

pm_lex_state_t pm_parser::lex_state

The current state of the lexer.

Definition at line 491 of file parser.h.

Referenced by pm_parser_init().

◆ magic_comment_list

pm_list_t pm_parser::magic_comment_list

The list of magic comments that have been found while parsing.

Definition at line 563 of file parser.h.

Referenced by pm_parser_free(), pm_serialize_content(), and pm_serialize_lex().

◆ newline_list

pm_newline_list_t pm_parser::newline_list

This is the list of newline offsets in the source file.

Definition at line 618 of file parser.h.

Referenced by pm_parser_free(), and pm_parser_init().

◆ next_start

const uint8_t* pm_parser::next_start

This is a special field set on the parser when we need the parser to jump to a specific location when lexing the next token, as opposed to just using the end of the previous token.

Normally this is NULL.

Definition at line 549 of file parser.h.
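
The choice between the two starting points can be sketched as follows. The struct and function names are hypothetical, and clearing the field after a single use is an assumption inferred from the note that it is normally NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the parser state needed for this decision. */
typedef struct {
    const uint8_t *previous_end; /* end of the previous token */
    const uint8_t *next_start;   /* normally NULL */
} lexer_position_t;

/* Pick where the next token starts: the override if one is set,
 * otherwise the end of the previous token. */
static const uint8_t *next_token_start(lexer_position_t *position) {
    assert(position != NULL);
    if (position->next_start != NULL) {
        const uint8_t *start = position->next_start;
        position->next_start = NULL; /* assumed to be a one-shot override */
        return start;
    }
    return position->previous_end;
}
```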

◆ pattern_matching_newlines

bool pm_parser::pattern_matching_newlines

This flag indicates that we are currently parsing a pattern matching expression, which affects the calculation of newlines.

Definition at line 688 of file parser.h.

◆ previous

pm_token_t pm_parser::previous

The previous token we were considering.

Definition at line 539 of file parser.h.

◆ recovering

bool pm_parser::recovering

Whether or not we're currently recovering from a syntax error.

Definition at line 675 of file parser.h.

◆ semantic_token_seen

bool pm_parser::semantic_token_seen

Whether or not the parser has seen a token that has semantic meaning (i.e., a token that is not a comment or whitespace).

Definition at line 700 of file parser.h.

◆ stack

pm_lex_mode_t pm_parser::stack[PM_LEX_STACK_SIZE]

The stack of lexer modes.

Definition at line 526 of file parser.h.

Referenced by pm_parser_init().

◆ start

const uint8_t* pm_parser::start

The pointer to the start of the source.

Definition at line 533 of file parser.h.

Referenced by pm_serialize_content().

◆ start_line

int32_t pm_parser::start_line

The line number at the start of the parse.

This will be used to offset the line numbers of all of the locations.

Definition at line 638 of file parser.h.

Referenced by pm_parser_init(), pm_serialize_content(), pm_serialize_lex(), and pm_serialize_parse_comments().
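
As an illustration of how such an offset might be applied, the sketch below converts a byte offset into a line number by binary searching a table of line-start offsets and then adding start_line. The table layout (first entry 0, each later entry the byte after a newline) and the function name are assumptions, not the layout of pm_newline_list_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup: offsets holds the byte offset at which each line
 * begins, in ascending order, with offsets[0] == 0. Returns the line
 * containing byte_offset, numbered so that the first line is start_line. */
static int32_t line_number(const size_t *offsets, size_t count, size_t byte_offset, int32_t start_line) {
    assert(offsets != NULL && count > 0 && offsets[0] == 0);

    /* Invariant: offsets[low] <= byte_offset, and every entry at or past
     * high is greater than byte_offset. */
    size_t low = 0;
    size_t high = count;
    while (high - low > 1) {
        size_t mid = low + (high - low) / 2;
        if (offsets[mid] <= byte_offset) {
            low = mid;
        } else {
            high = mid;
        }
    }

    return start_line + (int32_t) low;
}
```

For the source "a\nbc\nd" the line starts are 0, 2, and 5, so byte 3 falls on the second line; with a start_line of 10 the same byte would be reported on line 11.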

◆ suppress_warnings

bool pm_parser::suppress_warnings

Whether or not warnings should be suppressed.

This will be set to true if the consumer of the library requested it, usually because they are parsing when $VERBOSE is nil.

Definition at line 713 of file parser.h.

Referenced by pm_parser_init().

◆ warning_list

pm_list_t pm_parser::warning_list

The list of warnings that have been found while parsing.

Definition at line 569 of file parser.h.

Referenced by pm_parse_success_p(), pm_parser_free(), pm_serialize_content(), and pm_serialize_lex().


The documentation for this struct was generated from the following file: