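The DeepSeek-R1-Lite preview is out, so I pulled out my go-to benchmark problem and ran another round of tests:
Unfortunately, R1 failed all three attempts. The first attempt failed on test case 2, the second produced code with a syntax error, and the third, made after I fed the error message back, still had a syntax error.
I also tested the new Claude 3.5 Sonnet. It failed both attempts, once on test case 2 and once on test case 4.
I tested o1-mini again as well. It was as reliable as before and passed on the first try.
It looks like surpassing the o1 series is not that easy after all.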
---
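Here is the test problem once more:
Implement the function parse_partial_json, which extracts as much information as possible from an incomplete JSON string and returns a dict, passing all of the test cases: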
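def parse_partial_json(partial_json_str):
    """
    Extract as much information as possible from an incomplete JSON string and return a dict.
    Must pass every case in test_cases; for an empty dict, either {} or {'': ''} may be returned.
    Must support JSON nested up to 3 levels deep.
    """
    return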
test_cases = [
    # only an opening brace
    ('{', {}, {'': ''}),
    # opening brace and the opening quote of a key
    ('{"', {}, {'': ''}),
    # incomplete key
    ('{"中', {'中': ''}),
    # only key name
    ('{"中文"', {'中文': ''}),
    # key name and colon
    ('{"中文":', {'中文': ''}),
    # key and incomplete value
    ('{"中文":"你', {'中文': '你'}),
    # complete key-value pair, no closing brace
    ('{"中文":"你好"', {'中文': '你好'}),
    # complete key-value pair and trailing comma
    ('{"中文":"你好",', {'中文': '你好'}),
    # complete key-value pair and the opening quote of the next key
    ('{"中文":"你好", "', {'中文': '你好'}),
    # complete key-value pair and incomplete next key
    ('{"中文":"你好", "英', {'中文': '你好', '英': ''}),
    # one complete key-value pair and an incomplete second value
    ('{"中文":"你好", "英文":"Hel', {'中文': '你好', '英文': 'Hel'}),
    # extra characters only, no JSON at all
    ('下', {}, {'': ''}),
    # extra characters before the JSON starts
    ('下面是符合要求的json字符串:{"中', {'中': ''}),
    # complete JSON and extra characters at the end
    ('{"中文":"你好", "英文":"Hello"}extra', {'中文': '你好', '英文': 'Hello'}),
    # empty JSON object
    ('{}', {}, {'': ''}),
    # incomplete value containing a Unicode escape sequence
    ('{"中文":"你\\u597d', {'中文': '你\u597d'}),
    # special characters and incomplete value
    ('{"中文":"!@#$', {'中文': '!@#$'}),
    # nested JSON and incomplete value
    ('{"中文":{"问候":"你"', {'中文': {'问候': '你'}}),
    # array value, no closing brace
    ('{"中文":["你好","嗨"]', {'中文': ['你好', '嗨']}),
]
# Testing the function
for i, (input_str, *expected_values) in enumerate(test_cases):
    output = parse_partial_json(input_str)
    assert output in expected_values, f"Test case {i} failed: expected one of {expected_values}, got {output}"
print("All test cases passed!")
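
For reference, one possible way to approach the problem is a small, lenient recursive-descent parser that treats "ran out of input" as "close whatever is still open". The sketch below is only an illustration and is not the output of any of the models tested above. The helper names (_skip_ws, _parse_string, _parse_value, _parse_object, _parse_array) are hypothetical, and choices such as returning '' for a missing value or keeping a cut-off literal like tru as raw text are assumptions made to match the test cases.

```python
import json


def _skip_ws(s, i):
    """Advance i past any whitespace."""
    while i < len(s) and s[i].isspace():
        i += 1
    return i


def _parse_string(s, i):
    """Parse a string whose opening quote is s[i]; return (text, next_i, closed)."""
    j = i + 1
    while j < len(s) and s[j] != '"':
        j += 2 if s[j] == '\\' else 1  # step over escaped characters
    closed = j < len(s)
    raw = s[i + 1:j]
    while True:
        try:
            text = json.loads('"' + raw + '"', strict=False)
            break
        except ValueError:
            raw = raw[:-1]  # drop a truncated or invalid escape such as \u59
    return text, (j + 1 if closed else len(s)), closed


def _parse_value(s, i):
    """Parse any JSON value starting at i; a missing value becomes ''."""
    i = _skip_ws(s, i)
    if i >= len(s):
        return '', i
    if s[i] == '{':
        return _parse_object(s, i)
    if s[i] == '[':
        return _parse_array(s, i)
    if s[i] == '"':
        text, i, _ = _parse_string(s, i)
        return text, i
    j = i
    while j < len(s) and s[j] not in ',]}' and not s[j].isspace():
        j += 1
    token = s[i:j]
    try:
        return json.loads(token), j  # numbers, true, false, null
    except ValueError:
        return token, j  # assumption: keep a cut-off literal as raw text


def _parse_object(s, i):
    """Parse an object whose opening brace is s[i]; tolerate a missing '}'."""
    result = {}
    i += 1
    while True:
        i = _skip_ws(s, i)
        if i >= len(s):
            break
        if s[i] == '}':
            i += 1
            break
        if s[i] != '"':
            i += 1  # skip commas and anything unexpected
            continue
        key, i, closed = _parse_string(s, i)
        if not key and not closed:
            break  # a key cut off right after its opening quote carries no data
        i = _skip_ws(s, i)
        value = ''
        if i < len(s) and s[i] == ':':
            value, i = _parse_value(s, i + 1)
        result[key] = value
    return result, i


def _parse_array(s, i):
    """Parse an array whose opening bracket is s[i]; tolerate a missing ']'."""
    items = []
    i += 1
    while True:
        i = _skip_ws(s, i)
        if i >= len(s):
            break
        if s[i] == ']':
            i += 1
            break
        if s[i] in ',}':
            i += 1
            continue
        value, i = _parse_value(s, i)
        items.append(value)
    return items, i


def parse_partial_json(partial_json_str):
    """Return a dict with whatever can be recovered from a truncated JSON object."""
    start = partial_json_str.find('{')  # ignore any text before the object
    if start == -1:
        return {}
    result, _ = _parse_object(partial_json_str, start)
    return result
```

Because the recursion simply follows the nesting of the input, this sketch is not limited to 3 levels. It makes no attempt to repair input that is malformed in the middle; it only handles input that stops early or is surrounded by extra text.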