2022.12.10 Group Meeting: Code Generation

Applications

  • test generation
  • code transformation
  • code repair
  • reproduction

Methodology

Data driven

deep learning

ChatGPT: template-based

Random Generation

  • Grammar-based Genetic Programming (AST)
  • limitation: only supports a few simple grammars
  • Genetic Evolution (R-Lib)
  • CFG-based Genetic Programming
    • e.g., Y = X + X | X * X
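
A minimal sketch of random generation from the example grammar above, assuming a small dictionary-based CFG encoding. The rule for Y comes from the notes; the alternatives for X and the names GRAMMAR, derive, and generate are hypothetical, added only so that derivations terminate.

```python
import random

# Nonterminals map to a list of alternatives; each alternative is a list of symbols.
# "Y" follows the notes; the expansion of "X" is an assumption for illustration.
GRAMMAR = {
    "Y": [["X", "+", "X"], ["X", "*", "X"]],
    "X": [["a"], ["b"], ["1"]],
}

def derive(symbol, rng):
    """Expand `symbol` with randomly chosen productions until only terminals remain."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol
    production = rng.choice(GRAMMAR[symbol])
    tokens = []
    for s in production:
        tokens.extend(derive(s, rng))
    return tokens

def generate(rng=None):
    """Return one random sentence derived from the start symbol Y."""
    rng = rng or random.Random()
    return " ".join(derive("Y", rng))

if __name__ == "__main__":
    print(generate(random.Random(0)))
```

Every output has the shape "operand operator operand", because both alternatives for Y have exactly three symbols.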

Idea

Lexical (REG), Syntax (CFG), Semantics (Property)

  • REG: hard to model
  • CFG: easy to model

y = 3.5, y = REG

REG = {3.5, 4.5, 4.51}

Model a string as a REG.
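
One way to read "y = REG" is that the right-hand side is any string from a regular language. The sketch below assumes that language is "decimal literals"; the pattern and the names NUMBER and in_reg are illustrative, not taken from the notes.

```python
import re

# Assumed lexical rule: digits, a dot, then digits (e.g. 3.5, 4.51).
NUMBER = re.compile(r"[0-9]+\.[0-9]+")

def in_reg(text):
    """Return True if the whole string belongs to the assumed regular language."""
    return NUMBER.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ["3.5", "4.5", "4.51"]:
        print(sample, in_reg(sample))
```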

Semantics: Compile-Run information

Existing code generation tools cannot generate code with semantics.
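
As a rough illustration of what "compile-run information" could look like, the sketch below compiles and executes a generated Python snippet and records the outcome. The function name compile_and_run and the fields of the returned dictionary are assumptions, and Python stands in for whatever target language the generator actually emits.

```python
def compile_and_run(source):
    """Collect compile-time and run-time outcomes for a generated snippet."""
    info = {"compiles": False, "runs": False, "error": None, "env": {}}
    try:
        code = compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        info["error"] = f"SyntaxError: {exc.msg}"
        return info
    info["compiles"] = True

    env = {}
    try:
        exec(code, env)  # only safe for snippets you trust
    except Exception as exc:
        info["error"] = f"{type(exc).__name__}: {exc}"
        return info
    info["runs"] = True
    # Keep the variables the snippet defined, dropping the injected builtins.
    info["env"] = {k: v for k, v in env.items() if k != "__builtins__"}
    return info

if __name__ == "__main__":
    print(compile_and_run("y = 3 + 5 * 2"))
    print(compile_and_run("y = 3 + 5 * c"))
```

The second call compiles but fails at run time because c is undefined, which is the kind of feedback a purely grammar-driven generator never sees.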

Methodology

CFG Derivation

Example

e.g.,

S -> E + T;

T -> (F) | F;

y = 3 + 5 * c -> tree -> Genetic Programming

limitation: the tree loses the CFG information

CFG -> tree: possible

tree -> CFG: not possible

So we need to carry the CFG information in the tree.
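
A minimal sketch of a tree that carries CFG information, assuming each node records the nonterminal it expands and the index of the production used. The rule for S and the rules for T follow the example; the rules for E and F are left undefined in the notes, so the ones here are assumptions, as are the names Node, build, text, and productions.

```python
from dataclasses import dataclass, field
from typing import Optional

# S and T follow the notes; the rules for E and F are assumed for illustration.
GRAMMAR = {
    "S": [["E", "+", "T"]],
    "E": [["F"]],
    "T": [["(", "F", ")"], ["F"]],
    "F": [["3"], ["5"], ["c"]],
}

@dataclass
class Node:
    symbol: str                     # nonterminal or terminal
    rule: Optional[int] = None      # index of the production used; None for terminals
    children: list = field(default_factory=list)

def build(symbol, choices):
    """Build a derivation tree, consuming production indices in preorder."""
    if symbol not in GRAMMAR:
        return Node(symbol)
    rule = next(choices)
    children = [build(s, choices) for s in GRAMMAR[symbol][rule]]
    return Node(symbol, rule, children)

def text(node):
    """Flatten the tree back into the generated string."""
    if node.rule is None:
        return node.symbol
    return " ".join(text(c) for c in node.children)

def productions(node):
    """Recover the productions used, which a plain expression tree does not store."""
    if node.rule is None:
        return []
    used = [(node.symbol, tuple(GRAMMAR[node.symbol][node.rule]))]
    for child in node.children:
        used.extend(productions(child))
    return used

tree = build("S", iter([0, 0, 0, 0, 1]))

if __name__ == "__main__":
    print(text(tree))
    for lhs, rhs in productions(tree):
        print(lhs, "->", " ".join(rhs))
```

Because every node keeps its nonterminal, a genetic operator could replace a subtree only with another derivation of the same nonterminal, so the mutated tree stays within the grammar.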