How does Pipeline work for NLP tasks, and what do the accuracy scores imply?

[First, I would like to thank the SO community, with whose support I was able to finish building an end-to-end data science project (including deployment).]

Over to my question.

I wanted to create an app that predicts success or failure from textual input data.

To construct the model, I used existing algorithms rather than developing one from scratch.

While training on my DataFrame (a textual corpus), I frequently wrote:

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", my_chosen_model()),  # placeholder for whichever estimator is chosen
])
That is because I saw several code examples doing so. However, I didn't clearly understand what the pipeline actually does. Can anyone please explain it (or share a link that can help people like me, from non-programming backgrounds, understand it better)?
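For context, here is a minimal sketch (with made-up toy texts and `MultinomialNB` as an example classifier, not necessarily the asker's setup) showing that `Pipeline` simply chains the fit/transform steps that would otherwise be called by hand:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy corpus, purely illustrative.
texts = ["project went well", "deadline missed badly",
         "great success overall", "total failure again"]
labels = [1, 0, 1, 0]  # 1 = success, 0 = failure

# Without a pipeline: each step is fitted and applied manually, in order.
vect = CountVectorizer()
counts = vect.fit_transform(texts)
tfidf = TfidfTransformer()
features = tfidf.fit_transform(counts)
clf = MultinomialNB().fit(features, labels)

# With a pipeline: the same three steps, chained automatically.
pipe = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("clf", MultinomialNB()),
])
pipe.fit(texts, labels)

# Both routes produce the same prediction for new text, because the
# pipeline is just the manual sequence packaged as one estimator.
new = ["went well"]
manual_pred = clf.predict(tfidf.transform(vect.transform(new)))
pipe_pred = pipe.predict(new)
print(manual_pred[0] == pipe_pred[0])
```

The practical benefit is that `pipe.fit(...)` and `pipe.predict(...)` apply every step in order, which avoids leaking test data into the vectorizer and makes the whole chain usable inside cross-validation or grid search as a single object.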

After completing my experiments, I obtained these accuracy scores on the text data:

RandomForest    SVM    MLP    MultinomialNB
0.80            0.85   0.86    0.80

When I increased the amount of data (from the initial 20,000 samples to 50,000), the accuracy scores were:

RandomForest    SVM    MLP    MultinomialNB
0.85            0.85   0.87    0.82
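For reference, a comparison like the tables above can be run by swapping only the final step of an otherwise identical pipeline. The toy data, model classes, and settings below are my own illustration (e.g. `LinearSVC` standing in for "SVM"), not the asker's actual configuration:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import MultinomialNB

# Tiny illustrative corpus; real experiments used 20k-50k samples.
texts = ["great outcome", "bad result", "went fine", "it failed",
         "huge success", "complete failure", "worked well", "broke down"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

models = {
    "RandomForest": RandomForestClassifier(random_state=0),
    "SVM": LinearSVC(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
    "MultinomialNB": MultinomialNB(),
}

scores = {}
for name, model in models.items():
    pipe = Pipeline([
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", model),  # only this step changes between runs
    ])
    pipe.fit(texts, labels)
    # Training accuracy on toy data; a real comparison needs a held-out
    # test set or cross-validation.
    scores[name] = pipe.score(texts, labels)
    print(name, scores[name])
```

Keeping the vectorizer and TF-IDF steps fixed while varying the classifier makes the scores directly comparable, since every model sees exactly the same features.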

I also tried BERT, which gave an accuracy of 0.72 on the text data.

Can someone please explain to me here:

  1. Why did BERT perform so poorly, despite BERT being preferred by many? I followed the exact steps mentioned here for my task:

  2. What do the accuracy scores imply about the different algorithms with regard to the data? And would it be wise to choose MLP / SVM, or to choose Random Forest because of its jump in performance with more data?

Source: Python-3x Questions