Use Gemini 2.5 Pro, which allows five requests per day.
The prompt is as follows:
**Role:**
You are a seasoned researcher in the field of artificial intelligence and computer vision. You excel at interpreting cutting-edge academic papers in a clear and structured manner and can distinguish between the core contributions and experimental details of an article.

**Task:**
I will provide you with the content of a paper about MLLM (Multimodal Large Language Model), which may include sections such as the abstract, introduction, methodology, experiments, or the full text. Please analyze this article for me and output a report according to the following requirements.

**Requirements:**
1. **Overview:**
- **Background:** What is the current development and bottleneck in the field studied in this article?
- **Core Problem:** What core issue or challenge does this article attempt to address?
- **Workflow:** What is the workflow of this article?
2. **Detailed Analysis:**
- **Method/Model Architecture:** Describe in detail the core architecture of the proposed model (e.g., XXX-MLLM). How does it process and integrate multimodal information (visual, linguistic, etc.)? What are the key technical components?
- **Experimental Setup:**
  - **Datasets:** On which datasets were the experiments conducted?
  - **Baseline Models:** Which well-known models were compared in the article?
- **Main Results:**
  - On which metrics were significant improvements achieved?
  - What are the most important experimental results? Please provide key data to support this.
  - Does the article include ablation studies to validate the effectiveness of its design? What are the key conclusions?
- **Limitations:** Does the article discuss the limitations of its model or directions for future work?
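Then copy Gemini's output into a local Markdown file named `<original paper title>_summary.md` and open it in VS Code. Use find-and-replace in regular-expression mode with the pattern `\[cite[^\]]*\]` and an empty replacement to match and remove the citation markers, then convert the file to PDF with an extension (remember to trust the current workspace, otherwise the extension cannot be used).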
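
The citation cleanup can also be scripted instead of done by hand in the editor. The following is a minimal Python sketch that applies the same regular expression, `\[cite[^\]]*\]`, to a summary file. The function names `strip_citations` and `clean_summary` and the example file name are hypothetical, and the sketch assumes the markers look like `[cite: 12]` or `[cite_start]`, i.e. anything that starts with `[cite` and ends at the next `]`.

```python
import re
from pathlib import Path

# Same pattern as the one used in VS Code: "[cite", then any
# characters other than "]", then the closing "]".
CITE_PATTERN = re.compile(r"\[cite[^\]]*\]")


def strip_citations(text: str) -> str:
    """Return text with every [cite...] marker removed."""
    return CITE_PATTERN.sub("", text)


def clean_summary(path: str) -> None:
    """Rewrite the Markdown file at `path` in place without citation markers."""
    file = Path(path)
    original = file.read_text(encoding="utf-8")
    file.write_text(strip_citations(original), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical file name; replace it with your own summary file.
    target = "paper_summary.md"
    if Path(target).exists():
        clean_summary(target)
```

Because the pattern only matches brackets that begin with `[cite`, ordinary Markdown links and numeric references such as `[1]` are left alone. The PDF conversion step still happens in VS Code afterwards.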