New Jersey to Use AI to Score Standardized Writing Tests
(TNS) — Artificial intelligence will be used to score most of the writing New Jersey students do on the new statewide standardized tests set to debut this spring, state education officials said.
The AI system will be used to grade student essays and short answers on the English Language Arts section of the statewide exams, according to a state-approved testing proposal. The system will be trained using scores generated by human scorers on practice tests given to students in October and November.
New Jersey is debuting a new type of state test, called the New Jersey Student Learning Assessments-Adaptive, this spring. The exams will be given to students in grades 3 through 10 to test their knowledge of English, math and science.
There will also be a new version of the state’s high school exit exam for high school juniors, now called the New Jersey Graduation Proficiency Assessment-Adaptive.
Like the previous version of the test, known as the NJSLA, the exams will be given via computer. But the new version will be “adaptive,” meaning students will get different questions based on their previous answers on the exam — a practice that is supposed to make scoring the tests more precise.
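The state has not published the algorithm behind the adaptive exams, but the general technique, computer-adaptive testing, is well established: the software keeps a running estimate of a student's level and picks each next question near it. The sketch below is a minimal, hypothetical illustration of that idea; the question bank, the one-step difficulty rule and every name in it are assumptions, not Cambium's actual method.

```python
import random

# Minimal sketch of a computer-adaptive test loop. The real NJSLA-Adaptive
# algorithm is not public; this "step up after a correct answer, step down
# after a miss" rule is an illustrative assumption only.

QUESTION_BANK = [
    {"id": i, "difficulty": d}  # difficulty on an arbitrary 1-10 scale
    for i, d in enumerate([1, 2, 3, 4, 5, 5, 6, 7, 8, 9, 10])
]

def next_question(bank, target_difficulty, asked_ids):
    """Pick the unused question whose difficulty is closest to the target."""
    remaining = [q for q in bank if q["id"] not in asked_ids]
    return min(remaining, key=lambda q: abs(q["difficulty"] - target_difficulty))

def run_adaptive_test(bank, num_items=5):
    difficulty = 5  # start near the middle of the scale
    asked = set()
    for _ in range(num_items):
        q = next_question(bank, difficulty, asked)
        asked.add(q["id"])
        correct = random.random() < 0.5  # stand-in for the student's answer
        # Adapt: a harder question after a correct answer, easier after a miss.
        difficulty += 1 if correct else -1
        difficulty = max(1, min(10, difficulty))
    return difficulty  # crude proxy for the student's final ability estimate

print(run_adaptive_test(QUESTION_BANK))
```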
The AI system will be used to score the essays and written questions, but there will still be some human scorers, state Department of Education Spokesperson Michael Yaple said.
If a student’s written response is identified as “unusual” or “borderline,” it will be “flagged for human review,” Yaple said.
“The system regularly conducts quality assurance checks to ensure that the scores assigned by the automated scoring engine match human scores through strict quality controls,” he added.
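Cambium's engine internals are not public, so the sketch below only illustrates the kind of routing Yaple describes, in which a machine score that looks borderline, or a response that looks unusual compared with the human-scored training essays, is diverted to a person. The thresholds, field names and confidence measure are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Illustrative sketch of the routing the officials describe: the engine scores
# most responses, but "unusual" or "borderline" ones go to a human reviewer.
# Thresholds and fields here are assumptions, not Cambium's actual system.

@dataclass
class EngineResult:
    score: int          # e.g., 1-4 on a writing rubric
    confidence: float   # engine's confidence in its own score, 0.0-1.0
    novelty: float      # how unlike the human-scored training essays, 0.0-1.0

CONFIDENCE_FLOOR = 0.80   # below this, the score is "borderline"
NOVELTY_CEILING = 0.60    # above this, the response is "unusual"

def route(result: EngineResult) -> str:
    if result.confidence < CONFIDENCE_FLOOR or result.novelty > NOVELTY_CEILING:
        return "human_review"   # flagged for trained handscoring
    return "auto_accept"        # machine score stands

print(route(EngineResult(score=3, confidence=0.65, novelty=0.2)))  # human_review
```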
Cambium, the company overseeing the new tests, does not use generative AI, the kind of artificial intelligence behind ChatGPT-style platforms, which can create new content and is known to sometimes hallucinate false or inaccurate information, Yaple said.
Instead, the automated scoring system will have strict parameters “with proven consistency, and human scoring remains the foundation of the process, validating accuracy at multiple checkpoints throughout the scoring workflow,” state education officials said in a statement.
Computerized scoring of New Jersey’s state tests is nothing new. Last year, about 90 percent of student essays on the NJSLA and the state high school exit exams were scored solely by an automated scoring system, Yaple said.
But some educators have concerns about the extensive use of AI to grade the new version of the tests that will eventually be taken by nearly all of New Jersey’s 1.3 million public school students.
Using a version of AI to score student writing is risky, said Steve Beatty, president of the New Jersey Education Association, the state’s largest teachers union.
He said he would hate to see “some student fail on a computer-graded test only to find out later on that there was some sort of error.”
The NJEA is against high-stakes testing in general, Beatty said. But if the tests are going to continue, “then we want trained educators — humans — doing” the scoring.
If a student fails the AI-scored sections of the exams, there should be a plan to have the writing reassessed by a human, he said.
“They should go back to a person to be verified,” Beatty said.
NEW TESTING CONTRACT
New Jersey students will begin taking the new NJSLA-Adaptive exams during a month-long testing window between April 27 and May 29. The tests are usually given over several consecutive days.
The testing window for the new NJGPA-Adaptive high school exit exam for high school juniors will be from March 16 to April 1, according to a state Department of Education testing schedule.
The new statewide NJSLA and NJGPA tests were developed by Cambium Assessment, a company that won a $58.7 million, two-year contract with the state.
According to the Cambium proposal, Measurement Incorporated, a company located in Durham, North Carolina, will be responsible for providing and training the people who will do the human “handscoring” when AI-generated essay and written response scores are flagged for review.
In its proposal to the state, Cambium said the company assumes “25 percent of the overall responses will be routed for trained handscoring.”
New Jersey officials said AI was not used to create test items on the new version of the tests and artificial intelligence will not be used to determine which questions students see on the adaptive assessments.
Jeffrey Hauger, who served as director of assessments for the state Department of Education from 2010 to 2018, said New Jersey has a long history of using computers to help score the written portion of state tests. He later worked as an adviser to Pearson, the company that previously had the contract to provide the state NJSLA tests.
Hauger said that around 2016, the state started implementing a system that used one human and one automated scorer to assess each piece of student writing.
If a large discrepancy between the two scores was found, the essay would be read by a second human, he said.
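That two-reader setup is a standard pattern in handscoring. Below is a generic sketch of the rule Hauger describes; the one-point tolerance for treating scores as agreeing and the choice to average them are assumed details, since the article does not specify the thresholds Pearson used.

```python
# Generic sketch of the human-plus-machine protocol Hauger describes:
# each essay gets one human score and one machine score, and a second
# human reads it only when the two disagree by more than a tolerance.
# The 1-point tolerance and the averaging rule are assumptions.

ADJACENCY_TOLERANCE = 1  # scores within 1 rubric point count as agreeing

def final_score(human: int, machine: int, second_human=None) -> float:
    if abs(human - machine) <= ADJACENCY_TOLERANCE:
        return (human + machine) / 2      # routine case: average the pair
    if second_human is None:
        raise ValueError("large discrepancy: essay needs a second human read")
    return second_human                    # the adjudicator's score is final

print(final_score(3, 3))                   # 3.0
print(final_score(4, 1, second_human=3))   # 3, resolved by the second reader
```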
“It was a tool for efficiency, but the human was always involved throughout the process back then,” Hauger said.
AI scoring is now more sophisticated, he said.
“Technology has improved. And so, it’s not as big of a leap now as maybe people think it is,” Hauger said.
During Gov. Phil Murphy’s time in office, the department started relying more on automated scoring and moving away from having each piece of writing evaluated by both a machine and a human, he said.
FLAGGING PROBLEMS
AI scoring has been controversial in other states.
In Massachusetts, AI grading errors were blamed for 1,400 incorrect scores on the state’s Massachusetts Comprehensive Assessment System, known as the MCAS, last year.
In Texas, several districts questioned whether AI grading was fair on its statewide tests in recent years.
The Dallas Independent School District has challenged thousands of AI-generated essay scores on Texas’ statewide STAAR standardized tests over the past two years.
Cambium and Pearson, the companies involved in New Jersey’s testing, both contributed to Texas’ standardized testing system.
In 2024, the Dallas school district asked the state to have 4,600 tests rescored by humans.
About 44 percent of the rescored tests came back with higher scores after a human read them, said Jacob Cortez, Dallas’ assistant superintendent in charge of evaluation and assessment.
The district also sent thousands of AI-scored tests for rescoring last year and nearly 40 percent came back with higher scores from humans, the district said.
The results for the AI-scored third-grade tests were the most troubling: 85 percent of those sent back showed an improved score when humans read the students’ work.
“That is not okay,” Cortez said.
The Dallas school district, which serves about 139,000 students, limited the number of tests it sent back for rescoring because it had to pay $50 for each test that did not receive an improved score, local officials said.
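Those figures make the district's exposure easy to estimate. As a back-of-the-envelope check, and assuming the $50 fee applied to the 2024 batch described above:

```python
# Back-of-the-envelope check on the Dallas figures reported above.
tests_sent = 4_600
improved_rate = 0.44          # ~44 percent came back with higher scores
fee_per_unimproved = 50       # dollars, charged when the score did not rise

unimproved = round(tests_sent * (1 - improved_rate))
print(unimproved)                        # 2576 tests
print(unimproved * fee_per_unimproved)   # $128,800 in fees for that batch
```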
Cambium officials did not respond to requests for comment about the Dallas accuracy issues or the company’s AI scoring practices.
New Jersey officials declined to comment on questions about AI scoring accuracy in other states.
“New Jersey cannot comment on another state’s assessment and scoring process,” Yaple said.
Lily Laux, New Jersey’s new commissioner of education, also did not respond to a request to comment. In her previous job as Texas’ deputy commissioner of school programs, she helped design the state’s standardized testing system, according to her LinkedIn profile.
The problems with AI scoring in Dallas raise questions about the system, said Scott Marion, principal learning associate at the Center for Assessment, a nonprofit, nonpartisan consulting firm.
“Is it not being trained well? Is it not being trained on a diverse enough population?” Marion asked.
AI scoring makes financial sense, but states also need to be careful not to rely on it too heavily, he said. He is comfortable with about 80 percent of the writing being AI-scored because the systems still need human backups.
“We’ve been doing this for so long,” he said referring to the use of AI to score student writing.
Many students, teachers and parents may be surprised to learn how much of the writing done in school is already scored by AI, education advocates said.
Many “parents have no idea this is a thing,” said Julie Borst, executive director of community organizing for Save Our Schools New Jersey, a statewide advocacy group.
She is concerned that students with unique writing styles might end up with lower scores on tests because AI is looking for specific words and phrases or a standard number of sentences for top scores.
Borst, whose organization has long opposed high-stakes standardized testing, said that in the end, it will still be up to teachers to know where students are doing well and where they are struggling.
“The teacher is going to know where those weaknesses are. They’re going to know where those strengths lie,” she said. “You cannot tell that — at the student level — from a standardized test.”
©2026 Advance Local Media LLC. Distributed by Tribune Content Agency, LLC.
