fast-discovery-server crash due to uninitialized change pointer upon start-up #5578

Open
1 task done
owillebo opened this issue Jan 15, 2025 · 0 comments
Labels
triage Issue pending classification

owillebo commented Jan 15, 2025

Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

fast-discovery-server does not crash.

Current behavior

fast-discovery-server crashes.

Steps to reproduce

Start fast-discovery-server, then start 20 reader processes that read the same topic and 1 writer process that writes to that topic.

Depending on the hardware, about 1 in 10 runs ends with fast-discovery-server crashing. On "fast" hardware the crash may not happen at all.

Fast DDS version/commit

repositories:
    foonathan_memory_vendor:
        type: git
        url: https://github.com/eProsima/foonathan_memory_vendor.git
        version: v1.3.1
    fastcdr:
        type: git
        url: https://github.com/eProsima/Fast-CDR.git
        version: v2.2.6
    fastdds:
        type: git
        url: https://github.com/eProsima/Fast-DDS.git
        version: v3.1.1
    fastddsgen:
        type: git
        url: https://github.com/eProsima/Fast-DDS-Gen.git
        version: v4.0.3

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

UDPv4

Additional context

Platform/Architecture: Windows 11 Pro 24H2, Visual Studio 2022

image

Note that a RelWithDebInfo build is used.

...\src\fastdds\src\cpp\rtps\builtin\discovery\participant\PDPServer.cpp

  1500 bool PDPServer::process_to_send_list(
  1501         const std::vector<eprosima::fastdds::rtps::CacheChange_t*>& send_list,
  1502         fastdds::rtps::RTPSWriter* writer,
  1503         fastdds::rtps::WriterHistory* history)
  1504 {
  1505     // Iterate over DATAs in send_list
  1506     std::unique_lock<fastdds::RecursiveTimedMutex> lock(writer->getMutex());
  1507     for (auto change: send_list)
  1508     {
  1509         // If the DATA is already in the writer's history, then remove it, but do not release the change.
  1510         remove_change_from_history_nts(history, change, false);          // <========== Crash on this line.
  1511         // Set change's writer GUID so it matches with this writer
  1512         change->writerGUID = writer->getGuid();
  1513         // Add DATA to writer's history.
  1514         EPROSIMA_LOG_INFO(RTPS_PDP_SERVER, "Adding change from " << change->instanceHandle << " to history");
  1515         eprosima::fastdds::rtps::WriteParams wp = change->write_params;
  1516         history->add_change(change, wp);
  1517     }

Call stack:
>	fastdds-3.1.dll!eprosima::fastdds::rtps::PDPServer::process_to_send_list(const std::vector<eprosima::fastdds::rtps::CacheChange_t *,std::allocator<eprosima::fastdds::rtps::CacheChange_t *>> & send_list, eprosima::fastdds::rtps::RTPSWriter * writer, eprosima::fastdds::rtps::WriterHistory * history) Line 1510	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::PDPServer::process_to_send_lists() Line 1471	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::PDPServer::server_update_routine() Line 1136	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::DServerRoutineEvent::server_routine_event() Line 53	C++
 	[Inline Frame] fastdds-3.1.dll!std::_Func_class<bool>::operator()() Line 920	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::TimedEventImpl::trigger(std::chrono::time_point<std::chrono::steady_clock,std::chrono::duration<__int64,std::ratio<1,1000000000>>> current_time, std::chrono::time_point<std::chrono::steady_clock,std::chrono::duration<__int64,std::ratio<1,1000000000>>> cancel_time) Line 98	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::ResourceEvent::do_timer_actions() Line 284	C++
 	fastdds-3.1.dll!eprosima::fastdds::rtps::ResourceEvent::event_service() Line 185	C++
 	[Inline Frame] fastdds-3.1.dll!eprosima::fastdds::rtps::ResourceEvent::init_thread::__l2::<lambda_834625574d5ee60e7a19f002d69b0e57>::operator()() Line 326	C++
 	[Inline Frame] fastdds-3.1.dll!eprosima::create_thread::__l2::<lambda_e32a58efa111c8a4d7dbcd0654133ea8>::operator()() Line 109	C++
 	fastdds-3.1.dll!eprosima::thread::ThreadProxy<<lambda_e32a58efa111c8a4d7dbcd0654133ea8>>(void * Ptr) Line 42	C++
 	[External Code]

The change pointer is not initialized; its value is 0 in this case. This pointer is added to the send_list (DiscoveryDataBase::pdp_to_send_) by DiscoveryDataBase::add_own_pdp_to_send_, which in turn is called by PDPServer::assignRemoteEndpoints.

The change is DiscoveryParticipantInfo::change_. The DiscoveryParticipantInfo object is a value in the DiscoveryDataBase::participants_ map with key DiscoveryParticipantInfo::server_guid_prefix_.

For reasons not yet understood, DiscoveryParticipantInfo::change_ is sometimes still uninitialized when DiscoveryDataBase::add_own_pdp_to_send_ adds it to the DiscoveryDataBase::pdp_to_send_ list. When this happens, the subsequent call to PDPServer::process_to_send_list crashes as shown.
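
The failure mode can be illustrated outside Fast DDS with a minimal sketch. It is not Fast DDS code: `Change`, `writer_id`, and `process_send_list_checked` are hypothetical stand-ins for `CacheChange_t`, the writer GUID, and `PDPServer::process_to_send_list`. It only shows how a consumer loop could tolerate a null entry in the send list instead of dereferencing it; whether such a check belongs in the consumer or in the code that fills the list is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for eprosima::fastdds::rtps::CacheChange_t.
struct Change
{
    int writer_id = 0;
};

// Moves every non-null change from send_list into history, stamping it
// with writer_id first. Returns the number of null entries that were skipped.
std::size_t process_send_list_checked(
        const std::vector<Change*>& send_list,
        int writer_id,
        std::vector<Change*>& history)
{
    std::size_t skipped = 0;
    for (Change* change : send_list)
    {
        if (change == nullptr)
        {
            // A null entry means the producer queued a change that was never set.
            // Dereferencing it, as the unchecked loop does, is what crashes.
            ++skipped;
            continue;
        }
        change->writer_id = writer_id;
        history.push_back(change);
    }
    return skipped;
}
```

With a send list of `{&a, nullptr, &b}`, the function adds `a` and `b` to the history and reports one skipped entry. A non-zero return value would also be a convenient place to log how often the condition occurs.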

This was found to happen only during start-up. The crash can be worked around by having PDPServer::assignRemoteEndpoints skip the call to DiscoveryDataBase::add_own_pdp_to_send_ for its first 15 calls, see below:

On the left is the workaround.

image
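
Because the screenshot is not available as text, here is a rough sketch of the workaround as described, next to a possible alternative. Everything in it is hypothetical: `MiniDatabase` is a heavily reduced stand-in for `DiscoveryDataBase`, `own_change` plays the role of `DiscoveryParticipantInfo::change_`, and the call counter is kept inside the method for brevity, whereas the actual workaround sits in `PDPServer::assignRemoteEndpoints`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for eprosima::fastdds::rtps::CacheChange_t.
struct Change
{
};

// Hypothetical, heavily reduced stand-in for DiscoveryDataBase.
struct MiniDatabase
{
    Change* own_change = nullptr;      // role of DiscoveryParticipantInfo::change_
    std::vector<Change*> pdp_to_send;  // role of DiscoveryDataBase::pdp_to_send_
    unsigned calls_seen = 0;

    // Workaround as described in the report: ignore the first skip_calls requests.
    // Returns true if the own change was queued.
    bool add_own_pdp_after_warmup(unsigned skip_calls = 15)
    {
        if (calls_seen++ < skip_calls)
        {
            return false;
        }
        pdp_to_send.push_back(own_change);
        return true;
    }

    // Possible alternative: queue the own change only once it has been set.
    bool add_own_pdp_if_set()
    {
        if (own_change == nullptr)
        {
            return false;
        }
        pdp_to_send.push_back(own_change);
        return true;
    }
};
```

The call-count variant depends on timing: if the own change is still unset after the 15th call, a null pointer is queued anyway. The null-check variant does not depend on a fixed count, but it would only hide the symptom if the underlying cause is a missing initialization or a race elsewhere.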

I realize that more info may be needed for analysis. I'm happy to provide any information needed.

XML configuration file

NA

Relevant log output

No response

Network traffic capture

No response

@owillebo owillebo added the triage Issue pending classification label Jan 15, 2025
@owillebo owillebo changed the title fast-discovery-server crash due to uninitialized change pointer fast-discovery-server crash due to uninitialized change pointer upon start-up Jan 15, 2025