DataPalooza 2021

Bidirectional Encoder Representations from Transformers (BERT)

• Pre-Training Tasks

• #1 Masked language model (a code sketch follows this list)
• #2 Next sentence prediction

• Neural Network • 12 Transformer layers, hidden size 768, 12 self-attention heads (see the config sketch below)
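
A minimal sketch of pre-training task #1, masked language modeling, using the Hugging Face transformers library and the bert-base-uncased checkpoint (both assumptions; the slides do not name a library or checkpoint). One token is replaced with [MASK] and BERT predicts it from context:

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

# Assumed checkpoint matching the slide's BERT-base architecture.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask one token and let BERT fill it in.
text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary token.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # expected output: "paris"
```

Task #2, next sentence prediction, works the same way with a sentence-pair input and a binary classifier head (BertForNextSentencePrediction in the same library).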

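To make the architecture numbers concrete, here is a sketch of how the slide's hyperparameters map onto a Hugging Face BertConfig (an assumed library; this builds a fresh, untrained model, not the released checkpoint):

```python
from transformers import BertConfig, BertModel

config = BertConfig(
    num_hidden_layers=12,    # 12 Transformer encoder layers
    hidden_size=768,         # 768-dimensional hidden states
    num_attention_heads=12,  # 12 self-attention heads per layer
)
model = BertModel(config)

# BERT-base with these settings has roughly 110M parameters.
print(sum(p.numel() for p in model.parameters()))
```
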
Halfway Home: What questions do you have so far?
