DataPalooza 2021
Bidirectional Encoder Representations from Transformers (BERT)
• Pre-Training Tasks
• #1 Masked language model
• #2 Next sentence prediction
• Neural Network
• 12 layers, hidden size 768, 12 self-attention heads (see the sketch after this list)
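As a minimal sketch of pre-training task #1 (masked language modeling), the snippet below loads a BERT-base checkpoint, confirms the 12-layer / 768-hidden-size / 12-head configuration quoted above, and predicts a masked token. It assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint; these are illustrative choices, not part of the original slide.

```python
# Sketch of BERT's masked-language-model objective, assuming the
# Hugging Face transformers library and the bert-base-uncased checkpoint.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# The checkpoint's config matches the architecture quoted above.
print(model.config.num_hidden_layers)    # 12 layers
print(model.config.hidden_size)          # hidden size 768
print(model.config.num_attention_heads)  # 12 self-attention heads

# Task #1: mask a token and let the model fill it in using context
# from both directions (the "bidirectional" in BERT).
text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # e.g. "paris"
```

Task #2, next sentence prediction, uses a separate classification head over sentence pairs (exposed in the same library as BertForNextSentencePrediction) rather than the masked-token head shown here.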
Halfway Home: What questions do you have so far?