DataPalooza 2021

Bidirectional Encoder Representations from Transformers (BERT)

• Pre-Training Tasks

• #1 Masked language model
• #2 Next sentence prediction

• Neural Network • 12 layers, hidden size 768, 12 self-attention heads
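A minimal sketch of both points above, assuming the Hugging Face transformers library and the "bert-base-uncased" checkpoint (neither is named on the slide): the default BertConfig reflects the BERT-base architecture numbers, and the masked-language-model head illustrates pre-training task #1. Task #2, next sentence prediction, works analogously with a binary classification head over sentence pairs.

```python
# Illustrative only: uses Hugging Face `transformers`, which the slides do not name.
import torch
from transformers import BertConfig, BertTokenizer, BertForMaskedLM

# BERT-base architecture: 12 transformer layers, hidden size 768, 12 self-attention heads.
config = BertConfig()  # default values correspond to BERT-base
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)  # 12 768 12

# Pre-training task #1: masked language modeling.
# Mask a token and ask the model to predict the original word.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the position of the [MASK] token and take the highest-scoring vocabulary entry.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_pos].argmax(dim=-1).item()
print(tokenizer.decode([predicted_id]))  # typically "paris"
```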

Halfway Home: What questions do you have so far?
