Part 1: Kea Migration Assistant support
=======================================

Files:
------
 - data.h (tailq list and element type declarations)
 - data.c (element type code)
 - keama.h (DHCP declarations)
 - keama.c (main() code)
 - json.c (JSON parser)
 - option.c (option tables and code)
 - keama.8 (man page)

The code heavily uses tailq lists, i.e. doubly linked lists with
a pointer to the last (tail) element.

The element structure mimics the Kea Element class with a few differences:
 - no smart pointers
 - extra fields to handle declaration kind, skip and comments
 - maps are implemented as lists with an extra key field so the order
   of insertion is kept and duplicates are possible
 - strings are length + content (vs C strings)

There is no attempt to avoid memory leaks.

The skip flag is printed as '//' at the beginning of lines. It is set
when something cannot be converted, and the issue counter (returned
by the keama command) is incremented.

Part 2: ISC DHCP lexer organization
===================================

Files:
-----
 - dhctoken.h (from includes, enum dhcp_token definition)
 - conflex.c (from common, lexical analyzer code)

Tokens (dhcp_token enum): characters are set to their ASCII value,
others are >= 256 without real organization (e.g. END_OF_FILE is 607).

The state is in a parse structure named "cfile". There is one per file
and a few routines save it in order to backtrack over a larger span
than the usual lookahead.

The largest function is intern() which recognizes keywords with
a switch on the first character and a tree of if strcasecmp's.

Standard routines:
-----------------

enum dhcp_token
next_token(const char **rval, unsigned *rlen, struct parse *cfile);

and

enum dhcp_token
peek_token(const char **rval, unsigned *rlen, struct parse *cfile);

 rval: if not null the content of the token is put in it
 rlen: if not null the length of the token is put in it
 cfile: lexer context
 return: the integer value of the token

Changes:
-------

Added LBRACKET '[' and RBRACKET ']' tokens for JSON parser
(switch on dhcp_token type).

Added to struct parse: comments to collect ISC DHCP # comments, an
element stack to follow the declaration hierarchy, and an issue counter.

Moved the parse_warn routine (renamed parse_error and made fatal)
from conflex.c to keama.c.

Part 3: ISC DHCP parser organization
====================================

Files:
-----
 - confparse.c (from server, for the server in parse_statement())
 - parse.c (from common)

4 classes: parameters, declarations, executable statements and expressions.

The original code parses config and lease files; I kept only the first,
with the exception of parse_binding_value().

entry point
    |
    V
conf_file_parse
    |
    V
conf_file_subparse  <- read_conf_file (for include)
  until END_OF_FILE call
    |
    V
parse_statement
  parse parameters and declarations
  switch on token and call parse_xxx_declaration routines
  on default or DHCPv6 token in DHCPv4 mode call parse_executable_statement
  and put the result under the "statement" key
    |
    V
parse_executable_statement

According to comments the grammar is:

   conf-file :== parameters declarations END_OF_FILE
   parameters :== <nil> | parameter | parameters parameter
   declarations :== <nil> | declaration | declarations declaration

   statement :== parameter | declaration

   parameter :== DEFAULT_LEASE_TIME lease_time
               | MAX_LEASE_TIME lease_time
               | DYNAMIC_BOOTP_LEASE_CUTOFF date
               | DYNAMIC_BOOTP_LEASE_LENGTH lease_time
               | BOOT_UNKNOWN_CLIENTS boolean
               | ONE_LEASE_PER_CLIENT boolean
               | GET_LEASE_HOSTNAMES boolean
               | USE_HOST_DECL_NAME boolean
               | NEXT_SERVER ip-addr-or-hostname SEMI
               | option_parameter
               | SERVER-IDENTIFIER ip-addr-or-hostname SEMI
               | FILENAME string-parameter
               | SERVER_NAME string-parameter
               | hardware-parameter
               | fixed-address-parameter
               | ALLOW allow-deny-keyword
               | DENY allow-deny-keyword
               | USE_LEASE_ADDR_FOR_DEFAULT_ROUTE boolean
               | AUTHORITATIVE
               | NOT AUTHORITATIVE

   declaration :== host-declaration
                 | group-declaration
                 | shared-network-declaration
                 | subnet-declaration
                 | VENDOR_CLASS class-declaration
                 | USER_CLASS class-declaration
                 | RANGE address-range-declaration

Typically declarations use { } and are associated with a group
(changed to a type) in ROOT_GROUP (global), HOST_DECL, SHARED_NET_DECL,
SUBNET_DECL, CLASS_DECL, GROUP_DECL and POOL_DECL.

 ROOT:       parent = TOPLEVEL, children = everything but POOL
 HOST:       parent = ROOT, GROUP, warn on SHARED or SUBNET, children = none
 SHARED_NET: parent = ROOT, GROUP, children = HOST (warn), SUBNET, POOL4
 SUBNET:     parent = ROOT, GROUP, SHARED, children = HOST (warn), POOL
 CLASS:      parent = ROOT, GROUP, children = none
 GROUP:      parent = ROOT, SHARED, children = anything but POOL
 POOL:       parent = SHARED4, SUBNET, warn on others, children = none

isc_boolean_t
parse_statement(struct parse *cfile, int type, isc_boolean_t declaration);

 cfile: parser context
 type: declaration type
 declaration and return: declaration or parameter

On the common side:

   executable-statements :== executable-statement executable-statements
                           | executable-statement

   executable-statement :== IF if-statement
                          | ADD class-name SEMI
                          | BREAK SEMI
                          | OPTION option-parameter SEMI
                          | SUPERSEDE option-parameter SEMI
                          | PREPEND option-parameter SEMI
                          | APPEND option-parameter SEMI

isc_boolean_t
parse_executable_statement(struct element *result,
                           struct parse *cfile,
                           isc_boolean_t *lose,
                           enum expression_context case_context,
                           isc_boolean_t direct);

 result: map element where to put the statement
 cfile: parser context
 lose: set to ISC_TRUE on failure
 case_context: expression context
 direct: called directly by parse_statement so can execute config statements
 return: success

parse_executable_statement
  switch on keywords (far more than in the comments)
  on default with an identifier try a config option,
  on number or name call parse_expression for a function call
    |
    V
parse_expression

Expressions are divided into boolean, data (string) and numeric expressions.

   boolean_expression :== CHECK STRING
                        | NOT boolean-expression
                        | data-expression EQUAL data-expression
                        | data-expression BANG EQUAL data-expression
                        | data-expression REGEX_MATCH data-expression
                        | boolean-expression AND boolean-expression
                        | boolean-expression OR boolean-expression
                        | EXISTS OPTION-NAME

   data_expression :== SUBSTRING LPAREN data-expression COMMA numeric-expression
                         COMMA numeric-expression RPAREN
                     | CONCAT LPAREN data-expression COMMA
                         data-expression RPAREN
                     | SUFFIX LPAREN data_expression COMMA
                         numeric-expression RPAREN
                     | LCASE LPAREN data_expression RPAREN
                     | UCASE LPAREN data_expression RPAREN
                     | OPTION option_name
                     | HARDWARE
                     | PACKET LPAREN numeric-expression COMMA
                         numeric-expression RPAREN
                     | V6RELAY LPAREN numeric-expression COMMA
                         data-expression RPAREN
                     | STRING
                     | colon_separated_hex_list

   numeric-expression :== EXTRACT_INT LPAREN data-expression COMMA
                            number RPAREN
                        | NUMBER

parse_boolean_expression, parse_data_expression and parse_numeric_expression
call parse_expression and check its result.

parse_expression itself is divided into parse_non_binary and internal
handling of binary operators.

isc_boolean_t
parse_non_binary(struct element *expr,
                 struct parse *cfile,
                 isc_boolean_t *lose,
                 enum expression_context context)

isc_boolean_t
parse_expression(struct element *expr,
                 struct parse *cfile,
                 isc_boolean_t *lose,
                 enum expression_context context,
                 struct element *lhs,
                 enum expr_op binop)

 expr: map element where to put the result
 cfile: parser context
 lose: set to ISC_TRUE on failure
 context: expression context
 lhs: NULL or left hand side
 binop: expr_none or binary operation
 return: success

parse_non_binary
  switch on unary and nullary operator keywords
  on default try a variable reference or a function call

parse_expression
  call parse_non_binary to get the right hand side
  switch on binary operator keywords to get the next operation
  with one side, if expr_none return else get the second operand
  handle operator precedence, can call itself
  return a map entry with the operator name as the key, and
  left and right expression branches

Part 4: Expression processing
=============================

Files:
------
 - print.c (new)
 - eval.c (new)
 - reduce.c (new)

Print:
------

const char *
print_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_boolean_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_data_expression(struct element *expr, isc_boolean_t *lose);
const char *
print_numeric_expression(struct element *expr, isc_boolean_t *lose);

 expr: expression to print
 lose: failure (??? in output) flag
 return: the text representing the expression

Eval:
-----

struct element *
eval_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_boolean_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_data_expression(struct element *expr, isc_boolean_t *modifiedp);
struct element *
eval_numeric_expression(struct element *expr, isc_boolean_t *modifiedp);

 expr: expression to evaluate
 modifiedp: a different element was returned (still false for updates
            inside a map)
 return: the evaluated element (can have been updated for a map or a list,
         or can be a fully different element)

Evaluation happens at parsing time so it is mainly constant propagation
(no beta reduction for instance).

Reduce:
-------

struct element *
reduce_boolean_expression(struct element *expr);
struct element *
reduce_data_expression(struct element *expr);
struct element *
reduce_numeric_expression(struct element *expr);

 expr: expression to reduce
 return: NULL or the reduced expression as a Kea eval string

Reducing works for a limited (but interesting) set of expressions which
can be converted to Kea evaluatebool, and for literals.

Part 5: Specific issues
=======================

Reservations:
-------------
 ISC DHCP host declarations are global, Kea reservations were per subnet
only until 1.5.
 It is possible to use the fixed address but:
 - it is possible to finish with orphan reservations, i.e.
   reservations with an address which matches no subnet
 - a reservation can have no fixed address. In this case the MA puts
   the reservation in the last declared subnet.
 - a reservation can have more than one fixed address and these addresses
   can belong to different subnets.
   Current code pushes IPv4 extra addresses in a commented
   extra-ip-addresses but it is a legal feature for IPv6.
 - it is not easy to use prefix6

The use of groups in host declarations is unclear.

ISC DHCP UID is mapped to client-id, host-identifier to flex-id.
Host reservation identifiers are generated on first use.

Groups:
-------
 TODO: search missing parameters from the Kea syntax.
(will be done in the third pass)

Shared-Networks:
----------------
 Waiting for the feature to be supported by Kea.
Currently at the end of a shared network declaration:
 - if there are no subnets it is a fatal error
 - if there is one subnet the shared-network is squeezed
 - if there is more than one subnet the shared-network is commented
TODO (useful only with Kea support for shared networks): combine permit /
deny classes (e.g. create negation) and pop filters to subnets when
there is one pool.

Vendor-Classes and User-Classes:
--------------------------------
 ISC DHCP code is inconsistent: in particular before setting the
super-class "tname" to "implicit-vendor-class" / "implicit-user-class"
it allocates a buffer for data but does not copy the lexical value
"val" into it... So I removed support.

Classes:
--------
 Only pure client-classes are supported by Kea.
 Dynamic/deleted stuff is not supported but does it make sense?
 Spawning classes is not supported.
 Match class selector is converted to Kea eval test when the corresponding
expression can be reduced. Fortunately it seems to be the common case!
 Lease limit is not supported.

Subclasses:
-----------
 Understood how it works:
 - (super) class defined with a MATCH <data-expression> (vs.
   MATCH IF <boolean-expression>)
 - subclasses defined by <superclass-name> <data-literal> which are
   equivalent to MATCH IF <data-expression> EQUAL <data-literal>
 So subclasses are convertible when the data expression can be reduced.

Cf https://kb.isc.org/article/AA-01092/202/OMAPI-support-for-classes-and-subclasses.html
which BTW suggests the management API could manage classes...

Hardware Addresses:
-------------------
 Kea supports only Ethernet.

Pools:
------
 No permissions are supported by Kea, with the exception of class members,
and those work in a very different way so they are not convertible.
 Mixed DHCPv6 address and prefix pools are not supported, perhaps in
this case the pool should be duplicated into pool and pd-pool instances?
 The bootp stuff was ifdef'ed out as bootp is obsolete.
 Temporary (aka IA_TA) is commented by the MA.
 ISC DHCP supports interval ranges for prefix6. Kea has a different
and IMHO more powerful model.
 Pool6 permissions are not supported.

Failover:
---------
 Display a warning on the first use.

Interfaces:
-----------
 Referenced interface names are pushed to an interfaces-config but it is
very (too!) easy to finish with a Kea config without any interface.

Hostnames:
----------
 ISC DHCP does dynamic resolution in parse_ip_addr_or_hostname.
Static (at conversion time) resolution to one address is done by the MA
for fixed-address. Resolution is considered painful: there are better
(and safer) ways to do this. The -r (resolve) command line parameter
controls the at-conversion-time resolution. Note only the first address
is returned.
TODO: check the multiple address comment is correctly taken
(need a known host resolving in a stable set of addresses)

Options:
--------
 Some options are known only in ISC DHCP (almost fixed), a few only by Kea.
Formats are supposed to be the same, the only known exception
(DHCPv4 domain-search) was fixed by #5087.
 For option spaces DHCPv4 vendor-encapsulated-options (code 43, in general
associated with vendor-class-identifier code 60) uses a dedicated feature
which had no equivalent in Kea (fixed).
 Option definitions are convertible with a few exceptions:
 - no support in Kea for an array of records (mainly by the lack of
   a corresponding syntax). BTW there is no known use either.
 - no support in Kea for an array at the end of a record (fixed)
 All unsupported option declarations are set to full binary (X).
 - X format means ASCII or hexa:
   * standard options are in general mapped to binary
   * new options are mapped to string with format x (vs x)
   * when a string got hexadecimal data a warning is added in comments
     suggesting to switch to plain binary.
 - ISC DHCP uses quotes for a domain-list but not for a domain-name,
   this is not very coherent and makes domain-list different from
   a domain-name array.
 Each time option data has a format which is not convertible, CSV false
binary data is produced.
 We have no example in ISC DHCP, Kea or standard but it is possible
that an option defined as a fixed sized record followed by (encapsulated)
suboptions misbehaves (it already breaks toElement).
 For operations on options ISC DHCP has supersede, send, append,
prepend, default (set if not yet present); Kea puts them in code order
with a few built-in exceptions.
 Finally, the way to force Kea to add an option to a response is
pretty different and can't be automatically translated (cf Kea #250).

Duplicates:
-----------
 Many things in ISC DHCP can be duplicated:
 - options can be redefined
 - same host identifier used twice
 - same fixed address used in two different hosts
 etc.
 Kea is far more strict and IMHO it is a good thing. Now the MA does
no particular check and multiple definitions work only for classes
(because it is the way the ISC DHCP parser works).
 If we have Docsis space options, they are standard in Kea so they
will conflict.

Dynamic DNS:
------------
 Details are very different so the MA maps only basic parameters
at the global scope.

Expressions:
------------
 ISC DHCP expressions are typed: boolean, numeric, and data aka string.
The default for a literal is to be a string so literal numbers are
interpreted in hexadecimal (for a strange consequence look at
https://kb.isc.org/article/AA-00334/56/Do-the-list-of-parameters-in-the-dhcp-parameter-request-list-need-to-be-in-hex.html ).

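The consequence of reading a literal number in a data (string) context
can be illustrated with a small sketch. This is not code from the MA or
from ISC DHCP: toy_data_literal_byte is a hypothetical helper which
only shows that the same digits give a different value when read as
hexadecimal, and it assumes a single-byte literal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: value of a one-byte literal when it is read in
 * a data context, i.e. as hexadecimal.
 * Returns -1 when the token is not a hexadecimal byte. */
static int
toy_data_literal_byte(const char *tok)
{
    char *end;
    unsigned long v;

    assert(tok != NULL);

    v = strtoul(tok, &end, 16);
    if (end == tok || *end != '\0' || v > 0xff)
        return -1;
    return (int)v;
}
```

With this reading the token 10 stands for the byte 0x10, i.e. 16 and
not 10, which is the kind of surprise the knowledge base article
describes.
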
 String literals are converted to string elements, hexadecimal literals
are converted to const-data maps.
TODO reduce more hexa aka const-data

 As booleans are not data there is no way to fix this:
/tmp/bool line 9: Expecting a data expression.
option ip-forwarding = foo = foo;
                                ^
Cf Kea #247
 The tautology 'foo = foo' is not a data expression so is rejected by
both the MA and dhcpd (BTW the role of the MA is not to fix ISC DHCP
shortcomings so it does what it is expected to do here).
 Note this does not work either:
option ip-forwarding = true;
because "true" is not a keyword and it is converted into a variable
reference... And I expect ISC DHCP makes this true a false at runtime
because the variable "true" is not defined by default.

 Reduced expressions are pretty printed to allow an extra check.
 Hardware for DHCPv4 is expanded into a concatenation of hw-type and
hw-address, this allows simplifying expressions where only one is used.

Variables:
----------
 ISC DHCP has a notion of variables in a scope where the scope can be
a lexical scope in the config or a scope in a function body
(ISC DHCP has even an unused "let" statement).
 There is a variant of bindings for lease files using types and able
to recognize booleans and numbers. Unfortunately this is very specific...

TODO:
 - global host reservations
 - class like if statement
 - add more tests for classes in pools and class generation