Empty lines don't work if the data ends with an empty line #10

EmilHvitfeldt · 2021-04-19T03:52:33Z

library(ggpage)
library(tidytext)
library(tidyverse)

# Input without a trailing newline: the string ends right after "fair."
text <- "Modeling as a statistical practice can encompass a wide variety of activities. 
This book focuses on supervised or predictive modeling for text, using text data 
to make predictions about the world around us. We use the tidymodels framework 
for modeling, a consistent and flexible collection of R packages developed to 
encourage good statistical practice.

Supervised machine learning using text data involves building a statistical 
model to estimate some output from input that includes language. The two types 
of models we train in this book are regression and classification. Think of 
regression models as predicting numeric or continuous outputs, such as 
predicting the year of a United States Supreme Court opinion from the text of 
that opinion. Think of classification models as predicting outputs that are 
discrete quantities or class labels, such as predicting whether a GitHub issue 
is about documentation or not from the text of the issue. Models like these can
be used to make predictions for new observations, to understand what features 
or characteristics contribute to differences in the output, and more. We can 
evaluate our models using performance metrics to determine which are best, which 
are acceptable for our specific context, and even which are fair."

tibble(text = text) %>%
  unnest_tokens(text, text, token = function(x) str_split(x, "\n")) %>%
  ggpage_quick()
#> Warning: Use of `data_1$x_space_right` is discouraged. Use `x_space_right`
#> instead.
#> Warning: Use of `data_1$x_page` is discouraged. Use `x_page` instead.
#> Warning: Use of `data_1$x_space_left` is discouraged. Use `x_space_left`
#> instead.
#> Warning: Use of `data_1$x_page` is discouraged. Use `x_page` instead.
#> Warning: Use of `data_1$line` is discouraged. Use `line` instead.
#> Warning: Use of `data_1$y_page` is discouraged. Use `y_page` instead.
#> Warning: Use of `data_1$line` is discouraged. Use `line` instead.
#> Warning: Use of `data_1$y_page` is discouraged. Use `y_page` instead.

# Same text, but ending with a trailing newline (an empty final line).
text <- "Modeling as a statistical practice can encompass a wide variety of activities. 
This book focuses on supervised or predictive modeling for text, using text data 
to make predictions about the world around us. We use the tidymodels framework 
for modeling, a consistent and flexible collection of R packages developed to 
encourage good statistical practice.

Supervised machine learning using text data involves building a statistical 
model to estimate some output from input that includes language. The two types 
of models we train in this book are regression and classification. Think of 
regression models as predicting numeric or continuous outputs, such as 
predicting the year of a United States Supreme Court opinion from the text of 
that opinion. Think of classification models as predicting outputs that are 
discrete quantities or class labels, such as predicting whether a GitHub issue 
is about documentation or not from the text of the issue. Models like these can
be used to make predictions for new observations, to understand what features 
or characteristics contribute to differences in the output, and more. We can 
evaluate our models using performance metrics to determine which are best, which 
are acceptable for our specific context, and even which are fair.
"

tibble(text = text) %>%
  unnest_tokens(text, text, token = function(x) str_split(x, "\n")) %>%
  ggpage_quick()
#> Warning: Use of `data_1$x_space_right` is discouraged. Use `x_space_right`
#> instead.
#> Warning: Use of `data_1$x_page` is discouraged. Use `x_page` instead.
#> Warning: Use of `data_1$x_space_left` is discouraged. Use `x_space_left`
#> instead.
#> Warning: Use of `data_1$x_page` is discouraged. Use `x_page` instead.
#> Warning: Use of `data_1$line` is discouraged. Use `line` instead.
#> Warning: Use of `data_1$y_page` is discouraged. Use `y_page` instead.
#> Warning: Use of `data_1$line` is discouraged. Use `line` instead.
#> Warning: Use of `data_1$y_page` is discouraged. Use `y_page` instead.

Created on 2021-04-18 by the reprex package (v1.0.0)
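
The only difference between the two inputs above is the trailing newline, so it may help to look at what the custom tokenizer returns in each case. The following is a minimal sketch, not a confirmed diagnosis: it uses short stand-in strings instead of the full text, and `strip_trailing_newlines()` is a hypothetical helper that is not part of ggpage or tidytext. It only shows that `str_split()` keeps an empty final piece when the string ends with "\n"; whether that extra piece is what ggpage trips over is an assumption.

```r
library(stringr)

# Short stand-ins for the two inputs in the reprex (hypothetical example strings).
without_trailing <- "first paragraph\n\nsecond paragraph"
with_trailing <- "first paragraph\n\nsecond paragraph\n"

# str_split() keeps empty pieces, including the one after a final "\n".
lines_without <- str_split(without_trailing, "\n")[[1]]
lines_with <- str_split(with_trailing, "\n")[[1]]

length(lines_without) # 3: "first paragraph", "", "second paragraph"
length(lines_with)    # 4: the same three pieces plus a trailing ""

# Hypothetical helper (an assumption, not an existing API):
# drop trailing newlines before splitting.
strip_trailing_newlines <- function(x) str_remove(x, "\n+$")

lines_fixed <- str_split(strip_trailing_newlines(with_trailing), "\n")[[1]]
identical(lines_fixed, lines_without) # TRUE
```

If the extra empty element does turn out to be the cause, trimming the input like this before `unnest_tokens()` could serve as a workaround until it is handled inside the package.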
