[ISSUE #8875] Fix HAConnection leak #8876

Merged
RongtongJin merged 1 commit into apache:develop from crazywen:patch-3
Oct 30, 2024

Conversation

@crazywen
Contributor

Which Issue(s) This PR Fixes

Fixes #issue_id

Brief Description

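In some abnormal scenarios, which can be reproduced by running liveness probes against the HA port, a connection can leak. Because conn.start launches its threads immediately, a connection that fails quickly enough runs removeConnection first and addConnection afterwards. The connection then stays in the HaService connectionList indefinitely, and as such entries accumulate they create a risk of running out of memory.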
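The ordering problem described above can be modeled in a few lines of Python. This is only a toy sketch, not RocketMQ code: `HAServiceModel`, `ConnectionModel`, and both `accept_*` functions are hypothetical names, and the `join()` call is there purely to force the "fails fast enough" interleaving so the outcome is deterministic. Registering the connection before starting it is shown as one plausible way to close the window, on the assumption that the worker thread is the only caller of the removal path.

```python
import threading


class HAServiceModel:
    """Toy stand-in for the service that tracks live connections."""

    def __init__(self):
        self.connection_list = []
        self._lock = threading.Lock()

    def add_connection(self, conn):
        with self._lock:
            self.connection_list.append(conn)

    def remove_connection(self, conn):
        with self._lock:
            if conn in self.connection_list:
                self.connection_list.remove(conn)


class ConnectionModel:
    """Toy connection whose worker fails at once and deregisters itself."""

    def __init__(self, service):
        self.service = service
        self.thread = threading.Thread(target=self._run)

    def _run(self):
        # The peer is a health probe, so the worker gives up immediately.
        self.service.remove_connection(self)

    def start(self):
        self.thread.start()


def accept_start_then_add(service):
    """Leaky order: the worker may remove the connection before it is added."""
    conn = ConnectionModel(service)
    conn.start()
    conn.thread.join()  # force the fast-failure interleaving
    service.add_connection(conn)


def accept_add_then_start(service):
    """Safer order: the connection is registered before its worker can run."""
    conn = ConnectionModel(service)
    service.add_connection(conn)
    conn.start()
    conn.thread.join()


leaky = HAServiceModel()
safe = HAServiceModel()
for _ in range(100):
    accept_start_then_add(leaky)
    accept_add_then_start(safe)

print(len(leaky.connection_list), len(safe.connection_list))
```

In the leaky order every probe leaves one entry behind, which is the accumulation pattern the description attributes to repeated port probes.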

How Did You Test This Change?

Use a Python script to probe HA port 10912 at high frequency and check whether the memory leak still occurs.

    # -*- coding: UTF-8 -*-

    import sys
    import socket
    import time
    sys.path.append(".")

    def tcp_health_check(host, port):
        """Open a TCP connection, send a timestamp, and close; return True on success."""
        now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
        sock = None  # defined up front so the except branch can test it safely
        try:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(1)
            sock.connect((host, port))
            sock.sendall(now.encode())
            print("{} TCP connection succeeded: {}:{}".format(now, host, port))
            sock.close()
            return True
        except socket.error as e:
            print("TCP connection failed: {}".format(e))
            if sock:
                sock.close()
            return False

    def main():
        host = '127.0.0.1'
        port = 10912  # HA port under test

        # Probe twice per second until interrupted.
        while True:
            tcp_health_check(host, port)
            time.sleep(0.5)

    if __name__ == "__main__":
        main()
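To try the probe logic without a running broker, the same connect, send, and close sequence can be pointed at a throwaway local listener. The sketch below is an assumption-laden stand-in: `probe_once` and `run_local_demo` are hypothetical helpers, the listener is a plain socket on an ephemeral port rather than an HA service, and it demonstrates only the probe traffic, not the leak itself.

```python
import socket
import threading


def probe_once(host, port, timeout=1.0):
    """Connect, send a few bytes, and close. Returns True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ping")
        return True
    except OSError:
        return False


def run_local_demo(probes=5):
    """Probe a local listener; return (successful probes, accepted connections)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(16)
    port = server.getsockname()[1]
    accepted = []

    def accept_loop():
        for _ in range(probes):
            conn, _addr = server.accept()
            conn.recv(64)  # read the probe payload (or EOF) before closing
            conn.close()
            accepted.append(1)

    worker = threading.Thread(target=accept_loop)
    worker.start()
    successes = sum(probe_once("127.0.0.1", port) for _ in range(probes))
    worker.join(timeout=5)
    server.close()
    return successes, len(accepted)


if __name__ == "__main__":
    print(run_local_demo())
```

Each probe produces exactly one short-lived accepted connection on the server side, which is the kind of traffic the script above generates against port 10912.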

@codecov-commenter

codecov-commenter commented Oct 30, 2024

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 47.56%. Comparing base (dd62ed0) to head (d92fbbf).
⚠️ Report is 414 commits behind head on develop.

Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #8876      +/-   ##
=============================================
- Coverage      47.63%   47.56%   -0.07%     
+ Complexity     11756    11734      -22     
=============================================
  Files           1304     1304              
  Lines          91043    91043              
  Branches       11675    11675              
=============================================
- Hits           43364    43301      -63     
- Misses         42346    42401      +55     
- Partials        5333     5341       +8     


@RongtongJin RongtongJin changed the title from "HAConnection 泄漏修复" ("HAConnection leak fix") to "Fix HAConnection leak" Oct 30, 2024
@RongtongJin RongtongJin changed the title from "Fix HAConnection leak" to "[ISSUE #8875] Fix HAConnection leak" Oct 30, 2024
@RongtongJin RongtongJin merged commit fe80772 into apache:develop Oct 30, 2024
@wangshuai67

wangshuai67 commented Apr 8, 2026

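This problem is real, and we ran into it. We watched it for a long time and assumed it was caused by message backlog, but then found that OOM also occurred when no messages were backlogged. It was also very regular, with an OOM roughly every 20 days. Our production environment has a RocketMQ health check that probes the port to see whether the broker is alive, and that caused HAConnection objects to take up a large amount of memory. I will post screenshots of the dump analysis later.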

Development

Successfully merging this pull request may close these issues.

[Bug] HAConnection poses a risk of leakage

4 participants
